GENERATING AND PROVIDING INFORMATION ABOUT
EXPECTED FUTURE PRICES OF ASSETS AND 
VISUALIZATION OF ASSET INFORMATION 

BACKGROUND 

This invention relates to generating and providing information about
expected future prices of assets, and to visualization of asset information.

Among the kinds of information available at web sites on the Internet are 
current and historical prices and volumes of stock transactions, prices of 
put or call options at specific strike prices and expiration dates for various 
stocks, and theoretical prices of put and call options that are derived using
formulas such as the Black-Scholes formula. Some web sites give 
predictions by individual experts of the future prices or price ranges of 
specific stocks. 

A call option gives the holder a right to buy an underlying marketable 
asset by an expiration date for a specified strike price. A put option gives
an analogous right to sell an asset. Options are called derivative securities
because they derive their values from the prices of the underlying assets.
Examples of underlying assets are corporate stock, commodity stock, and
currency. The price of an option is sometimes called the premium. 

People who buy and sell options are naturally interested in what
appropriate prices might be for the options. One well-known formula for
determining the prices for call and put options under idealized conditions
is called the Black-Scholes formula. Black-Scholes provides an estimate
of call or put prices for options having a defined expiration date, given a
current price of the underlying asset, an interest rate, and the volatility rate
(sometimes simply called volatility) of the asset. Black-Scholes assumes
constant interest rates and volatility, no arbitrage, and trading that is
continuous over a specified price range.

Information about investment assets such as corporate securities is often
presented as tables of values or ratios of values for successive time
periods.

Sometimes graphs or visualization devices are used to provide a more 
intuitive view of the information. 

One on-line service, Morningstar.com, uses a scatter plot in its
Morningstar Investment Radar, URL
(http://screen.morningstar.com/InvestmentRadar/Investm...).

Each point in the plot represents risk versus capitalization of an asset in a 
portfolio. 

Another on-line facility, FalconEye, URL 

(http://www.falconeye.com/falconeye/tracker/index.html), displays a
periscope-like view of a simulated cloud formation that represents

a multi-dimensional density map of all 6000+ Nasdaq
stocks, sorted in real-time by FalconEye Viz-Alerts™
(customizable indicators) that were created for the vertical
and horizontal axes. Each stock is like a pixel on the screen
and each color represents the density of stocks depicted in
that section of the Tracker Live Map. The distribution of
density allows you to instantly see the real-time technical
pressures on the market and gives you the knowledge to
trade more efficiently and productively.

ValuEngine, URL (http://valuengine.com/servlet/ValuationSummary#),
displays graphs of stock prices that include historical prices to a current
date followed by forecast price trends for future periods, including
forecast ranges above and below the forecast price trends.

SUMMARY

In general, in one aspect, the invention features a method in which data is
received that represents current prices of options on a given asset. An
estimate is derived from the data of a corresponding implied probability
distribution of the price of the asset at a future time. Information about the
probability distribution is made available within a time frame that is useful
to investors, for example, promptly after the current option price
information becomes available.

Implementations of the invention may include one or more of the 
following features. The data may represent a finite number of prices of
options at spaced-apart strike prices of the asset. A set of first differences
may be calculated of the finite number of prices to form an estimate of the
cumulative probability distribution of the price of the asset at a future
time. A set of second differences may be calculated of the finite number of
prices from the set of first differences to form the estimate of the
probability distribution function of the price of the asset at a future time.

In general, in another aspect, the invention features a method in which a
real time data feed is provided that contains information based on the
probability distribution. 

In general, in another aspect, the invention features a method that includes
providing a graphical user interface for viewing pages containing financial
information related to an asset; and when a user indicates an asset of
interest, displaying probability information related to the price of the asset
at a future time. 

In general, in another aspect, the invention features a method that includes
receiving data representing current prices of options on a given asset, the
options being associated with spaced-apart strike prices of the asset at a

Attorney Docket 11910-002001 
future time. The data includes shifted current prices of options resulting
from a shifted underlying price of the asset, the amount by which the asset
price has shifted being different from the amount by which the strike
prices are spaced apart. An estimate is derived from a quantized implied
probability distribution of the price of the asset at a future time, the
elements of the quantized probability distribution being more finely
spaced than for a probability distribution derived without the shifted
current price data.

In general, in another aspect, the invention includes deriving from said 
data an estimate of an implied probability distribution of the price of the
asset at a future time, the mathematical derivation including a smoothing 
operation. 

Implementations of the invention may include one or more of the 
following features. The smoothing operation may be performed in a
volatility domain.

In general, in another aspect, the invention includes deriving a volatility 
for each of the future dates in accordance with a predetermined option 
pricing formula that links option prices with strike prices of the asset; and 
generating a smoothed and extrapolated volatility function.

Implementations of the invention may include one or more of the
following features. The volatility function may be extrapolated to a wider
range of dates than the future dates and to other strike prices. The
smoothed volatility function may be applicable to conditions in which the
data is reliable under a predetermined measure of reliability. The implied
volatility function formula may have a quadratic form with two variables
representing a strike price and an expiration date. The coefficients of the
implied volatility function formula may be determined by applying
regression analysis to approximately fit the implied volatility function
formula to each of the implied volatilities.

In general, in another aspect, the invention features a method that includes 
receiving data representing current prices of options on assets belonging to
a portfolio, deriving from the data an estimate of an implied multivariate
distribution of the price of a quantity at a future time that depends on the
assets belonging to the portfolio, and making information about the
probability distribution available within a time frame that is useful to 
investors. 

In general, in another aspect, the invention features a method that includes
receiving data representing values of a set of factors that influence a
composite value, deriving from the data an estimate of an implied
multivariate distribution of the price of a quantity at a future time that
depends on assets belonging to a portfolio, and making information about
the probability distribution available within a time frame that is useful to
investors.

Implementations of the invention may include one or more of the 
following features. The mathematical derivation may include generating a 
multivariate probability distribution function based on a correlation among
the factors.

In general, in another aspect, the invention features a graphical user
interface that includes a user interface element adapted to enable a user to
indicate a future time, a user interface element adapted to show a current
price of an asset, and a user interface element adapted to show the
probability distribution of the price of the asset at the future time.



WO 02/17120

PCT/US01/25753



In general, in one aspect, the invention features a method that includes
continually generating current data that contains probability distributions
of prices of assets at future times, continually feeding the current data to a
recipient electronically, and the recipient using the fed data for services
provided to users.

In general, in another aspect, the invention features a method that includes 
receiving data representing current prices of options on assets belonging to 
a portfolio, receiving data representing current prices of market 
transactions associated with a second portfolio of assets, and providing 
information electronically on the probability that the second portfolio of
assets will reach a first value given the condition that the first portfolio of
assets reaches a specified price at a future time. 

In general, in another aspect, the invention features a method that includes 
receiving data representative of actual market transactions associated with 
a first portfolio of assets; receiving data representative of actual market
transactions associated with a second portfolio of assets; and providing
information through a network on the expectation value of the price of the
first portfolio of assets given the condition that the second portfolio of
assets reaches a first specified price at a specified future time.

In general, in another aspect, the invention features a method that includes
evaluating an event defined by a first multivariate expression that
represents a combination of macroeconomic variables at a time T, and
estimating (e.g., using Monte Carlo techniques) the probability that a
second multivariate expression that represents a combination of values of
assets of a portfolio will have a value greater than a constant B at time T if
the value of the first multivariate expression is greater than a constant A.
The market variables represented by the first multivariate expression can
include macroeconomic factors (such as interest rates), market preferences
regarding the style of company fundamentals (large/small companies,
rapid/steady growth, etc.), or market preferences for industry sectors.

In general, in another aspect, the invention features a method that includes
defining a regression expression that relates the value of one variable
representing a combination of macroeconomic variables at time T to a
second variable at time T that represents a combination of assets of a
portfolio, and estimating the probability that the second variable will have
a value greater than a constant B at time T if the value of the first variable
is greater than a constant A at time T, based on the ratio of the probability
of x being greater than A under the regression expression and the
probability of x being greater than A.

In general, in another aspect, the invention features a method that includes 
defining a current value of an option as a quadratic expression that 
depends on the difference between the current price of the option and the
current price of the underlying security, and using Monte Carlo techniques
to estimate a probability distribution of the value at a future time T of a 
portfolio that includes the option. 

The invention takes advantage of the realization that option prices for a
given underlying asset are indicative of the market's prediction of the
risk-neutral price of the underlying asset in the future (e.g., at the
expiration of the option). Option price data may be used to derive the
market's prediction in the form of an implied probability distribution of
future risk-neutral prices. Additional explanation of the significance of the
phrase risk-neutral is contained in the Appendix.

The implied probability distribution and other information related to it
may be made easily available to people for whom the information may be
useful, such as those considering an investment in the underlying asset, or
a brokerage firm advising such an investor.

In general, in another aspect, the invention features a method that includes
(a) displaying to a user a circular visualization element having sectors
arranged around a center of the element, the sectors respectively
corresponding to different groups of assets, and (b) in each of the sectors,
displaying an array of visual elements representative of respective assets
belonging to the group to which the sector corresponds, the visual
elements being arrayed with respect to distance from the center in
accordance with magnitudes of performance of the assets during a recent
period.

Implementations of the invention may include one or more of the
following features. The visual elements comprise displayed dots, one for
each of the assets. The visual elements exhibit visible characteristics that
correspond to categories of the assets within the group. The categories of
the assets within the group correspond to different capitalizations. The
dots are arranged along a radius of the sector to which they belong. Dots
that would otherwise lie on the radius at a given distance from the center
are displayed at different angular positions near to the radius. Each sector
has an angular extent that represents the fraction of asset items in the
sector relative to the total number of asset items in the universe being
plotted. The circular visualization element is subdivided into rings having
respectively different distances from the center. The rings are displayed in
different colors. The magnitudes of performance of the assets are
measured in percentage price change. The recent period comprises a
trading day on an asset market. The assets comprise securities issued by
corporations.

In general, in another aspect, the invention features a method that includes
displaying to a user a visualization element that indicates the odds of a
performance measure of an asset being within specified ranges of
identified values of the performance measure at a succession of times in
the future.

Implementations of the invention may include one or more of the
following features. The performance measure comprises a price of the
asset or a return percentage or a tax-adjusted return percentage. The
visualization element includes stripes superimposed on a graph of the
performance measure over time, each of the stripes representing one of the
specified ranges. Each of the stripes begins at a current time and becomes
broader as it extends to future times. A graphical device shows actual
historical values of the performance measure, e.g., in the form of a line
graph one end of which joins the visualization element at a point that
represents a current date. The visualization element includes two portions,
one of the portions representing the odds prior to a specified date based on
one assumption, the other of the portions representing the odds after the
specified date based on another assumption. The specified date is a date on
which tax effects change from the one assumption to the other assumption.

In general, in another aspect, the invention features a method that includes
displaying to a user a visualization element having graphical indicators of
the relative performance of a selected asset compared with the
performance of groups of assets in each of a succession of time periods,
each of the groups comprising assets representing a common style. The
relative performance is determined using an asset class factor model.

Among the advantages of the invention are one or more of the following:
Investors and prospective investors in an underlying asset, such as a
publicly-traded stock, are given access to a key additional piece of current
information, namely calculated data representing the market's view of the
future price of the stock. Brokerage firms, investment advisors, and other
companies involved in the securities markets are able to provide the
information and related services to their [...]
enabled to quickly visualize and grasp the significance of data that would
otherwise be more difficult to understand.

Other features and advantages will become apparent from the following 
description and from the claims. 

DESCRIPTION

Details of implementations of the invention are set forth in the figures and
the related description below.

Figures 1, 2, and 3 are graphs.

Figure 4 is a block diagram.

Figures 5, 6, and 7 are web pages.

Figures 8 and 9 illustrate user interfaces. 

Figure 10 shows data structures. 

Figures 11 through 15 show visualization techniques. 

In general, the price of a call or put option is determined by buyers and
sellers in the option market and carries information about the market's
prediction of the expected price of the underlying asset at the expiration
date. (The information does not include the premium that investors require
for bearing risk, which must be estimated separately. The average long-term
value of the risk premium is about 6% per year for all stocks and may
be adjusted for an individual stock's historical responsiveness to broader
market movements.)

The information carried in the prices of options having various strike
prices and expirations is used to derive probability distributions of the
asset's price at future times and to display corresponding information to
investors, for example, on the World Wide Web.

Basic method 

We first define some relevant terms. We define \(x\) as the strike price, \(c(x)\)
as the theoretical call price function (the price of the call as a function of
strike price), \(p(x)\) as the theoretical put price function, \(F(x)\) as the
cumulative distribution function (cdf) of the price of the underlying asset
at expiration, and \(f(x)\) as the probability density function (pdf) of the asset
price at expiration. By definition, \(f(x) = F'(x)\) (i.e., the probability density
function is the derivative of the cumulative distribution function).

The relationship between \(c(x)\), \(p(x)\), \(f(x)\), and \(F(x)\) can be succinctly stated
as:

\[ F(x) = c'(x) + 1 = p'(x); \tag{1a} \]
\[ f(x) = c''(x) = p''(x). \tag{1b} \]

In words, the pdf is the second derivative of either the call price function
or of the put price function. A simple proof of these relationships is given in
the Appendix. The Appendix also contains other detailed information
relating to the features of the invention.

This so-called "second-derivative method" for computing implied
probability distributions from option price data is known in the academic
literature, but apparently not very well known. For example, the standard
textbook "Options, Futures, and Other Derivatives," by John C. Hull
(Fourth Edition, 1999; Prentice-Hall) mentions implied probabilities, but
not the second-derivative method. Perhaps the best reference that we have
been able to find is J. C. Jackwerth and M. Rubinstein, "Recovering
probability distributions from option prices," J. Finance, vol. 51,
pp. 1611-1631 (1996), which has only six prior references. This paper cites
D. T. Breeden and R. H. Litzenberger, "Prices of state-contingent claims
implicit in option prices," J. Business, vol. 51, pp. 631-650 (1978) as the
originator of a second-derivative method, although the latter paper
nowhere mentions probabilities.

Approximating f(x) from finite bid and ask option prices 

Equations (1a) and (1b) are obtained by assuming that the variable \(x\) is
continuous and ranges from 0 to infinity. In practice, options are usually
traded within certain price ranges and only for certain price intervals (e.g.,
ranging from $110 to $180 at $5 intervals). Thus, the call and/or put
option prices are known only for a finite subset of strike prices. Under
such circumstances, estimates of Equations (1a) and (1b) can be computed
by taking differences instead of derivatives as follows.

We assume that the option prices \(c(x)\) and \(p(x)\) are quoted for a finite
subset of equally-spaced strike prices \(x = n\Delta\), where \(n\) is an integer, and
\(\Delta\) is the spacing between quoted prices. Define \(c_n = c(n\Delta)\),
\(p_n = p(n\Delta)\). Then the first derivatives \(c'(x)\) and \(p'(x)\) at
\(x = (n + \tfrac{1}{2})\Delta\) may be estimated by the first differences:

\[ c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta}; \tag{7a} \]
\[ p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}. \tag{7b} \]

The corresponding estimates of the cumulative distribution function
\(F_{n+1/2} = F((n + \tfrac{1}{2})\Delta)\) are

\[ F_{n+1/2} = 1 + c'_{n+1/2}; \tag{8a} \]
\[ F_{n+1/2} = p'_{n+1/2}. \tag{8b} \]

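The second derivatives \(c''(x)\) and \(p''(x)\) at \(x = n\Delta\) may likewise be
estimated by the second differences, i.e., differences of the estimates of
the first derivatives:

\[ c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta}; \tag{9a} \]
\[ p''_n = \frac{p'_{n+1/2} - p'_{n-1/2}}{\Delta}. \tag{9b} \]

Either of these estimates of the second derivatives may be used as an
estimate of the probability density values at \(x = n\Delta\), i.e., \(f(n\Delta)\):

\[ f_n = p''_n \quad \text{or} \quad f_n = c''_n. \tag{10} \]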
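
As an illustration only, the difference estimates of Equations (7)-(10) can be sketched in Python. The function name `implied_distribution` and the synthetic test case are assumptions made for this example and are not taken from the description: the asset price is taken to be uniformly distributed on [100, 200] with a zero interest rate, so that c(x) = (200 - x)^2 / 200, F(x) = (x - 100) / 100, and f(x) = 0.01.

```python
def implied_distribution(strikes, prices, kind="call"):
    """Estimate cdf and pdf values from option prices at equally spaced strikes.

    Returns (mid_strikes, cdf, inner_strikes, pdf).
    """
    if len(strikes) != len(prices) or len(strikes) < 3:
        raise ValueError("need at least three strike/price pairs")
    delta = strikes[1] - strikes[0]  # spacing between quoted strikes
    # First differences estimate c'(x) or p'(x) at the midpoints (n + 1/2) * delta.
    first = [(prices[i + 1] - prices[i]) / delta for i in range(len(prices) - 1)]
    mid_strikes = [strikes[i] + delta / 2 for i in range(len(strikes) - 1)]
    # cdf estimate: 1 + c' for calls, p' for puts.
    cdf = [1.0 + d for d in first] if kind == "call" else list(first)
    # Second differences estimate the density at the interior strikes n * delta.
    pdf = [(first[i + 1] - first[i]) / delta for i in range(len(first) - 1)]
    return mid_strikes, cdf, strikes[1:-1], pdf


# Hypothetical quotes generated from the uniform-distribution example.
strikes = [100 + 10 * n for n in range(11)]
calls = [(200 - x) ** 2 / 200 for x in strikes]
mids, cdf, inner, pdf = implied_distribution(strikes, calls)
for x, value in zip(mids, cdf):
    print(f"F({x}) ~ {value:.3f}")
```

Because the synthetic call curve is quadratic, the difference estimates agree with the exact cdf and pdf here; for real quotes they are only approximations.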

Moreover, the market prices of call and put options are usually given in
terms of a bid-ask spread, and thus either the bid price or the ask price (or
some intermediate value) may be used as the call or put option price. By
using the bid and ask prices for both the call option and the put option,
four estimates of \(F(x)\) and \(f(x)\) may be obtained. These estimates may be
combined according to their reliability in any desired way. For example,
one might use the estimate derived from the put bid price curve for values
of \(x\) less than the current price \(s\) of the underlying asset, and the estimate
derived from the call bid price curve for values of \(x\) greater than \(s\).

Examples of \(c_n\), \(p_n\), \(F_{n+1/2}\), and \(f_n\) are shown in Figures 1, 2, and 3 using
the data of TABLE 1 (see below).
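
The example rule for combining estimates (put-derived values below the current price, call-derived values above it) might be sketched as follows. The function name, the numbers, and the choice to use the call-derived value when the strike equals the current price are all assumptions made for illustration.

```python
def combine_estimates(strikes, from_puts, from_calls, s):
    """Pick the put-derived estimate for strikes below s, else the call-derived one."""
    return [p if x < s else c for x, p, c in zip(strikes, from_puts, from_calls)]


# Hypothetical cdf estimates at five strikes, with a current price of 152.
strikes = [140, 145, 150, 155, 160]
cdf_from_puts = [0.24, 0.39, 0.50, 0.60, 0.73]
cdf_from_calls = [0.23, 0.35, 0.53, 0.60, 0.74]
combined = combine_estimates(strikes, cdf_from_puts, cdf_from_calls, s=152)
print(combined)
```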

Tabular data 

TABLE 1 below shows sample bid prices of call and put options for strike
prices of an asset ranging from $110 to $180 at $5 intervals and the
cumulative distribution values \(F_{n+1/2}\) and probability density values \(f_n\)
computed according to Equations (7)-(10) above.

In the table, the values for \(F_{n+1/2}\) correspond to strike prices that are
midway between the two strike prices used to compute \(F_{n+1/2}\). Thus, the
cumulative distribution value shown to the right of the strike price $110
actually corresponds to the strike price $112.5, and the value to the right
of the strike price $115 actually corresponds to strike price $117.5, and so
forth.
TABLE 1

Strike   Call bid   F_{n+1/2} from   f_n from     Put bid   F_{n+1/2} from   f_n from
price    price      call price       call price   price     put price        put price
110      ...        0                             1/8       0
115      37 7/8     0.025            0.025        1/8       0.0125           0.0125
120      ...        0.025            0            3/16      0.0125           0
125      28 1/8     0.05             0.025        1/4       0.0375           0.025
130      23 3/8     0.125            0.075        7/16      0.0875           0.05
135      19         0.175            0.05         7/8       0.15             0.0625
140      14 7/8     0.225            0.05         1 5/8     0.2375           0.0875
145      11         0.35             0.125        2 13/16   0.3875           0.15
150      7 3/4      0.525            0.175        4 3/4     0.5              0.1125
155      5 3/8      0.6              0.075        7 1/4     0.6              0.1
160      3 3/8      0.7375           0.1375       10 1/4    0.725            0.125
165      2 1/16     0.825            0.0875       13 7/8    0.825            0.1
170      1 3/16     0.8875           0.0625       18        0.85             0.025
175      5/8        0.925            0.0375       22 1/4    0.925            0.075
180      1/4                                      26 7/8
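
As a check on the tabulated values, the call-derived cumulative distribution column can be recomputed from the call bid prices. This is a sketch only: the prices below are the call bids for strikes $125 through $180 as read from TABLE 1, converted to decimals, and the variable names are assumptions. In this data the tabulated density values coincide with successive differences of the cumulative values, that is, with the probability assigned to each $5 interval; dividing by the $5 spacing would give a density per dollar.

```python
strikes = list(range(125, 185, 5))  # 125, 130, ..., 180
call_bids = [28.125, 23.375, 19.0, 14.875, 11.0, 7.75,
             5.375, 3.375, 2.0625, 1.1875, 0.625, 0.25]
delta = 5.0  # spacing between quoted strikes

# cdf estimate at the midpoint above each strike: 1 + (c[n+1] - c[n]) / delta.
cdf = [1.0 + (call_bids[i + 1] - call_bids[i]) / delta
       for i in range(len(call_bids) - 1)]

# Probability falling between successive midpoints (difference of cdf values).
interval_prob = [cdf[i + 1] - cdf[i] for i in range(len(cdf) - 1)]

for strike, value in zip(strikes, cdf):
    print(f"strike {strike}: F at {strike + delta / 2} ~ {value:.4f}")
```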



Dynamic estimates for F(x) and f(x)

In Equations (7)-(10), the call and put option prices were assumed to be
static in the calculation of the cumulative distribution function \(F(x)\) and
probability density function \(f(x)\) for a finite subset of strike prices \(x = n\Delta\).
In the real world, the price \(s\) of the underlying asset changes with time,
and there will be a corresponding change in option prices. As a first order
approximation, if the price \(s\) increases by a small amount \(\delta\), then the
option price curves will effectively shift to the right by the amount \(\delta\).
(Here, \(\delta\) may be either positive or negative. For a more precise discussion
of the shift, see the Appendix.) As a result, the price \(c(x)\) or \(p(x)\) now
quoted at strike price \(x\) may be used as an estimate for the option price on
the previous price curve at strike price \(x' = x - \delta\). Thus, the prices on
the previous curve at a new discrete subset of strike prices \(x = n\Delta - \delta\)
become effectively visible. Given enough movements of the underlying
price, therefore, we can effectively compute estimates of \(c(x)\), \(p(x)\), \(F(x)\),
and \(f(x)\) for a subset of strike prices \(x\) that is much more closely spaced
than the subset available at any one time.
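
The first-order shift argument can be illustrated with a small sketch. Each snapshot of quotes is tagged with the amount by which the underlying price has moved relative to a reference time, and every quote at strike x is re-filed at the effective strike x - shift on the reference curve. The function name, the snapshot layout, and the synthetic reference curve are assumptions made for this example.

```python
def merge_shifted_quotes(snapshots):
    """Map quotes onto the reference curve and merge them into one finer grid.

    snapshots: list of (shift, {strike: price}) pairs, where shift is the
    change in the underlying price relative to the reference time.
    Later snapshots overwrite earlier ones at the same effective strike.
    """
    merged = {}
    for shift, quotes in snapshots:
        for strike, price in quotes.items():
            merged[strike - shift] = price  # effective strike x' = x - shift
    return sorted(merged.items())


def reference_call(x):
    """Hypothetical reference call price curve used to generate test quotes."""
    return (200 - x) ** 2 / 200


quoted_strikes = [140, 145, 150]
snapshots = [
    (0, {x: reference_call(x) for x in quoted_strikes}),       # reference time
    (2, {x: reference_call(x - 2) for x in quoted_strikes}),   # underlying up 2
    (-1, {x: reference_call(x + 1) for x in quoted_strikes}),  # underlying down 1
]
merged = merge_shifted_quotes(snapshots)
print([x for x, _ in merged])
```

Three snapshots on a $5 grid yield nine distinct effective strikes in this example, which is the sense in which the merged subset is more closely spaced.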

Extrapolating and smoothing probability distributions. 

In a typical options market, the option prices are available only for certain
expiration dates. In addition, the option prices are more reliable for
options that are actively traded, which are typically nearer-term options at
strike prices near the underlying price. It is therefore desirable to
extrapolate and interpolate probability distributions to times other than
actual expiration dates and to wider ranges of strike prices.

Any standard extrapolation and smoothing techniques may be used
directly on the cumulative distribution values \(F_{n+1/2}\) or probability density
values \(f_n\) to give a smoothed and extrapolated estimate of \(F(x)\) or \(f(x)\).
Similarly, given such estimated curves for a discrete subset of future times
\(T\), standard interpolation and extrapolation techniques may be used to
estimate such curves for other specified values of \(T\), or for a continuous
range of \(T > 0\).

A less direct but useful approach is to perform extrapolation and
smoothing on an implied volatility function, which is then used to
calculate the other functions, such as \(c(x)\), \(p(x)\), \(F(x)\), and \(f(x)\). The
volatility rate of an asset (often simply called its volatility) is a measure of
uncertainty about the returns provided by the asset. The volatility rates of
a stock may typically be in the range of 0.3 to 0.5 per year.
An advantage of performing extrapolation and smoothing on implied
volatility curves is that different types of volatility curves (so-called
"volatility smiles") are known and can be used as a guide to the
extrapolation and smoothing process to prevent "overfitting" of certain
unreliable data points.

The standard method of computing implied volatilities is to invert the
Black-Scholes pricing formula (see Appendix) for the actual call price \(c(x)\)
or put price \(p(x)\) of an underlying asset at a given strike price \(x\), given the
underlying price \(s\) (current price of the asset), the risk-free rate of interest
\(r\), and \(T\) (expiration date). When this is done for a range of values of \(x\), an
estimate of an implied volatility curve \(\sigma(x)\) is obtained. This curve may be
smoothed and extrapolated by any standard method to give a smoothed
curve \(\tilde{\sigma}(x)\). Then corresponding smoothed put and call price curves may
be computed using the Black-Scholes pricing formula and differentiated
once or twice to give a smoothed cdf or pdf. Finally, given such estimated
curves for a discrete subset of future times \(T\), standard interpolation and
extrapolation techniques may be used to estimate such curves for other
specified values of \(T\), or for a continuous range of \(T > 0\).
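
A minimal sketch of the inversion step is given below. It assumes the standard textbook Black-Scholes formula for a European call on an asset paying no dividends, with T measured in years to expiration; the Appendix formula is not reproduced in this section, so its exact form is an assumption here. The bisection search, the function names, and the search bounds are likewise illustrative choices.

```python
import math


def norm_cdf(z):
    """Cumulative distribution function of a standard normal variable."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bs_call(s, x, r, T, sigma):
    """Textbook Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(s / x) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return s * norm_cdf(d1) - x * math.exp(-r * T) * norm_cdf(d2)


def implied_vol(price, s, x, r, T, lo=1e-4, hi=5.0, tol=1e-10):
    """Find sigma such that bs_call(...) matches price, by bisection.

    Relies on the call price being increasing in sigma.
    """
    if not bs_call(s, x, r, T, lo) <= price <= bs_call(s, x, r, T, hi):
        raise ValueError("price is outside the range attainable on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s, x, r, T, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Round trip with hypothetical inputs: price a call at sigma = 0.4, then invert.
quoted = bs_call(s=150.0, x=160.0, r=0.05, T=0.5, sigma=0.4)
recovered = implied_vol(quoted, s=150.0, x=160.0, r=0.05, T=0.5)
print(round(recovered, 6))
```

Repeating the inversion across the quoted strikes gives the points of the implied volatility curve that is then smoothed.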

Another new way to compute implied volatilities is first to compute a
finite subset of cdf values \(F_{n+1/2}\) and then to invert the Black-Scholes cdf
formula (see Appendix) at these values. When this is done for a range of
values of \(x\), an estimate of a generally different implied volatility curve
\(\sigma_1(x)\) is obtained, called the cdf-implied volatility curve. Again, this curve
may be smoothed and extrapolated by any standard method to give a
smoothed curve \(\tilde{\sigma}_1(x)\). Then a corresponding smoothed cdf may be
computed from the Black-Scholes cdf formula, and differentiated once to
give a smoothed pdf. Finally, again, given such estimated curves for a
discrete subset of future times \(T\), standard interpolation and extrapolation
techniques may be used to estimate such curves for other specified values
of \(T\), or for a continuous range of \(T > 0\).

Some advantages of using the cdf-implied volatility curve rather than the
conventional implied volatility curve are that the computations are
simpler, at least from an estimate of \(F(x)\), and that it fits better with the
multivariate techniques to be discussed below.

A particular method for finding a smoothed and extrapolated implied
volatility curve \(\tilde{\sigma}_1(x, T)\) as a function of both strike price \(x\) and time \(T\) to
expiration is as follows. The volatility curve is assumed to be
approximated by a quadratic formula

\[ \tilde{\sigma}_1(x, T) = a_0 + a_1 x + a_2 x^2 + a_3 T + a_4 T^2 + a_5 x T. \tag{14} \]

The coefficients \(\{a_i\}\) are determined by regression to fit the available data
regarding \(\sigma_1(x, T)\) as closely as possible. Given the smoothed curve
\(\tilde{\sigma}_1(x, T)\), corresponding smoothed cdfs (for different \(x\)'s and \(T\)'s) may be
computed from the Black-Scholes cdf formula for each time \(T\), and
differentiated once to give a smoothed pdf. An alternative procedure, with
numerical advantages, is to use a quadratic fit like the above for a function
\(\tilde{\sigma}(x, T)\), and then invert the Black-Scholes cdf to find \(\tilde{\sigma}_1(x, T)\). See the
Appendix for the academic history of such approximations of \(\tilde{\sigma}(x, T)\).

Another useful variation is to fit \(\tilde{\sigma}(x, T)\) with a quadratic function of \(x\) at
times \(T\) which are specific expiration dates, then linearly interpolate at
other times \(T\).
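
The regression for the six coefficients of Equation (14) might be sketched as an ordinary least-squares fit. Everything specific in this example is an assumption: the function names, the use of normal equations with a small hand-written solver, the expression of strikes in units of the current price (which keeps the system well conditioned), and the synthetic coefficient values used to generate test data.

```python
def design_row(x, T):
    """Regressors of the quadratic form: 1, x, x^2, T, T^2, x*T."""
    return [1.0, x, x * x, T, T * T, x * T]


def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z


def fit_quadratic_vol(points):
    """Least-squares coefficients a_0..a_5 from (x, T, sigma) observations."""
    rows = [design_row(x, T) for x, T, _ in points]
    sigmas = [sigma for _, _, sigma in points]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
    Atb = [sum(r[i] * s for r, s in zip(rows, sigmas)) for i in range(6)]
    return solve_linear(AtA, Atb)


def predict(coeffs, x, T):
    """Evaluate the fitted quadratic form at (x, T)."""
    return sum(a * v for a, v in zip(coeffs, design_row(x, T)))


# Synthetic implied volatilities generated from hypothetical coefficients.
true_coeffs = [0.9, -1.0, 0.5, -0.1, 0.05, 0.02]
points = [(x, T, predict(true_coeffs, x, T))
          for x in (0.8, 0.9, 1.0, 1.1, 1.2)
          for T in (0.1, 0.25, 0.5, 1.0)]
coeffs = fit_quadratic_vol(points)
print([round(a, 6) for a in coeffs])
```

In practice a library least-squares routine would normally replace the hand-written solver, and observations judged unreliable could be dropped or down-weighted before fitting.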

Treatment of multiple assets 

The techniques described so far give probability distributions for the
future values of a single asset based on option price data for that asset.
However, in many cases an investor may be concerned with multiple
assets, for example all of the stocks in his or her portfolio, or in a mutual
fund, or in a certain index. Moreover, the investor may be concerned with
the relations between one group of assets and another.

A general method for dealing with such questions is to generate
multivariate probability distributions for all assets of interest. A
multivariate cdf may be written as \(F(x_1, x_2, \ldots, x_n)\), where the variables
\((x_1, x_2, \ldots, x_n)\) are the values of the \(n\) assets of interest.

We will assume that we know, from the techniques described above or
otherwise, the marginal cdfs F_i(x_i) for each of the individual variables.
As a first step, we may define for each x_i a function y_i(x_i), called a
"warping function," such that y_i(x_i) is a standard normal (Gaussian)
variable with mean 0 and variance 1. This is simply done by defining
y_i(x_i) such that F_i(x_i) = N(y_i(x_i)) for all values of x_i, where N(x)
denotes the cdf of a standard normal variable. The function y_i(x_i) may be
simply described in terms of σ_i(x_i). See the Appendix. Under mild
technical conditions such as having a marginal cdf that varies
monotonically, such a warping function y_i(x_i) has a well-defined inverse
warping function x_i(y_i).
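
The defining relation F_i(x_i) = N(y_i(x_i)) can be illustrated with a
short sketch. It is only an illustration under stated assumptions: the
helper names are hypothetical, the lognormal marginal cdf stands in for a
cdf derived from option prices, and the inverse warping function is found
by bisection, which relies on the marginal cdf being monotonic.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, variance 1


def make_warp(marginal_cdf):
    """Return y(x) such that marginal_cdf(x) == N(y(x))."""
    def warp(x):
        return STD_NORMAL.inv_cdf(marginal_cdf(x))
    return warp


def make_inverse_warp(marginal_cdf, lo, hi, tol=1e-10):
    """Return x(y) by bisection on a monotonically increasing marginal cdf.

    lo and hi are assumed to bracket every price of interest.
    """
    def inverse_warp(y):
        target = STD_NORMAL.cdf(y)
        a, b = lo, hi
        while b - a > tol:
            mid = 0.5 * (a + b)
            if marginal_cdf(mid) < target:
                a = mid
            else:
                b = mid
        return 0.5 * (a + b)
    return inverse_warp


def example_cdf(x):
    """Hypothetical marginal cdf: lognormal price, median 100, log-std 0.3."""
    return STD_NORMAL.cdf(math.log(x / 100.0) / 0.3)


warp = make_warp(example_cdf)
inverse_warp = make_inverse_warp(example_cdf, lo=1.0, hi=1000.0)
```

For this hypothetical marginal, the median price of 100 is warped to
y = 0, and applying the inverse warping function to a warped price
recovers the price to within the bisection tolerance.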

Second, we assume that we can find the historical pairwise correlations
between the warped standard normal variables y_i(x_i). These correlations
may be computed by standard techniques from any available set of
historical asset price data. We denote by C the n x n correlation matrix
whose entries are these historically-based correlations. Because each of
the variables y_i(x_i) is standard normal, the diagonal terms of C are all
equal to 1.

Now let F_C(x_1, ..., x_n) denote the cdf of a multivariate Gaussian random
n-tuple with zero mean and covariance matrix C. Define

F(x_1, x_2, \ldots, x_n) = F_C\bigl(y_1(x_1), y_2(x_2), \ldots, y_n(x_n)\bigr).

Then F(x_1, x_2, ..., x_n) is a multivariate cdf that (a) has the correct
(given) marginal cdfs F_i(x_i); and (b) has the correct (historical)
correlations between the warped standard normal variables y_i(x_i). We use
this cdf to answer questions involving the variables (x_1, x_2, ..., x_n).

For example, the investor might have a portfolio consisting of a given
quantity of each of these assets. The value of such a portfolio is the sum

x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n,    (15)

where h_i represents the quantity of the ith asset in the portfolio. The
investor might be interested in an estimate of the probability distribution
of the value x of the whole portfolio.

Such an estimate may be obtained by Monte Carlo simulation. For such a
simulation, a large number N of samples from the multivariate Gaussian
cdf F_C(y_1, ..., y_n) may be generated. Each sample (y_1, ..., y_n) may be
converted to a sample (x_1, x_2, ..., x_n) by using the inverse warping
functions x_i(y_i). The value x of the total portfolio may then be computed
for each sample. From these N values of x, the probability distribution of
x (e.g., its cdf F(x)) may be estimated.
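
The Monte Carlo procedure can be sketched as follows. This is an
illustration under stated assumptions: the two-asset correlation matrix,
the lognormal inverse warping functions, and the holdings are hypothetical,
and the Cholesky factorization is one standard way of drawing correlated
Gaussian samples rather than a technique named in this description.

```python
import math
import random


def cholesky(C):
    """Lower-triangular L with L * L^T == C (C symmetric positive definite)."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L


def sample_portfolio_values(C, inverse_warps, holdings, n_samples, seed=0):
    """Draw correlated normals y, map each to x_i(y_i), and sum h_i * x_i."""
    rng = random.Random(seed)
    L = cholesky(C)
    n = len(C)
    values = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        x = [inverse_warps[i](y[i]) for i in range(n)]
        values.append(sum(h * xi for h, xi in zip(holdings, x)))
    return values


def empirical_cdf(values, level):
    """Fraction of sampled portfolio values at or below the given level."""
    return sum(v <= level for v in values) / len(values)


# Hypothetical inputs: two assets with correlation 0.6 between warped variables.
C = [[1.0, 0.6], [0.6, 1.0]]
inverse_warps = [lambda y: 100.0 * math.exp(0.3 * y),
                 lambda y: 50.0 * math.exp(0.2 * y)]
holdings = [10.0, 20.0]
values = sample_portfolio_values(C, inverse_warps, holdings, n_samples=20000)
```

Calling `empirical_cdf(values, level)` for a range of levels gives an
estimate of the portfolio cdf F(x).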

In practice, it is useful to save the N multivariate samples in a large
database. Then the cdf of any quantity whose value is a function of the
variables (x_1, x_2, ..., x_n) may be estimated from this database. For
example, if the investor would like to know the cdf of some alternative
portfolio with different quantities of each asset, this can be quickly
determined from the stored database.

An investor may also determine the effect of one portfolio (or event(s) or
variables such as interest rates, P/E ratios, public interest in a certain
sector of the market) on another portfolio as follows. Assume that the first
portfolio is represented by x, where

(30) 

where each x_i may be viewed as the price of a portfolio component, and
the second portfolio is represented by y, where

y = g_1 y_1 + g_2 y_2 + \cdots + g_n y_n,    (31)

where each y_i may be viewed as the price of a portfolio component or
more broadly as any other variable (macroeconomic, fundamental, or
sector related).

Consider the "what-if" question: letting A and B be given positive
constants, if x ≥ A at time T, what is the probability that y ≥ B at time T?
This question can be answered by creating a Monte Carlo database as
above for the multivariate cdf F(x_1, x_2, ..., x_n) corresponding to time T,
identifying those samples for which x ≥ A, and then using only these
samples to estimate the probability that y ≥ B. More generally, any
conditional cdf of the form F(x | E) can be estimated similarly, where x is
any function of the variables (x_1, x_2, ..., x_n) and E is any event defined
in terms of the variables (x_1, x_2, ..., x_n).
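
The conditional estimate amounts to filtering the stored samples and
counting. A minimal sketch follows; the function name, the tiny
hand-written sample database, and the two portfolio definitions are
hypothetical and serve only to show the filtering step.

```python
def conditional_probability(samples, condition, event):
    """Estimate P(event | condition) from stored Monte Carlo samples."""
    selected = [s for s in samples if condition(s)]
    if not selected:
        raise ValueError("no samples satisfy the conditioning event")
    return sum(1 for s in selected if event(s)) / len(selected)


# Hypothetical database of joint samples (x_1, x_2) at time T.
samples = [(100.0, 50.0), (120.0, 55.0), (90.0, 40.0),
           (130.0, 45.0), (110.0, 60.0)]


def first_portfolio(s):
    """Hypothetical portfolio x holding one unit of the first asset."""
    return s[0]


def second_portfolio(s):
    """Hypothetical portfolio y holding two units of the second asset."""
    return 2.0 * s[1]


A, B = 110.0, 100.0
p = conditional_probability(
    samples,
    condition=lambda s: first_portfolio(s) >= A,
    event=lambda s: second_portfolio(s) >= B,
)
```

In practice the database would hold many thousands of samples, and a
conditioning event that is rarely satisfied would leave few samples for
the estimate.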

Similarly, suppose an investor would like to know whether it is reasonable
to believe that a certain stock or portfolio x will have a value greater than
a given constant A at time T. This kind of question can be addressed by
estimating the conditional cdf of some other related and perhaps
better-understood variable (or combination of variables) y at time T, given
that x > A. If the resulting distribution for y does not look reasonable,
then the investor may conclude that it is unreasonable to expect that x > A.

Applications that use the probability distribution information 

A wide variety of techniques may be used to accumulate and process the
information needed for the calculations described above and to provide the
information to users directly or indirectly through third parties. Some of
these techniques are described below.

As shown in figure 4, the probability distribution information can be
provided to users from a host server 102 connected to a communication
network 104, for example, a public network such as the Internet or a
private network such as a corporate intranet or local area network (LAN).
For purposes of illustration, the following discussion assumes that network
104 is the Internet.

The host server 102 includes a software suite 116, a financial database
120, and a communications module 122. The communications module 122
transmits and receives data generated by the host server 102 according to
the communication protocols of the network 104.

Also connected to the network are one or more of each of the following
(only one is shown in each case): an individual or institutional user 108, an
advertisement provider 110, a financial institution 112, a third party web
server 114, a media operator 124, and a financial information provider
106.

The operator of the host server could be, for example, a financial
information source, a private company, a vendor of investment services, or
a consortium of companies that provides a centralized database of
information.

The host server 102 runs typical operating system and web server
programs that are part of the software suite 116. The web server programs
allow the host server 102 to operate as a web server and generate web
pages or elements of web pages, e.g., in HTML or XML code, that allow
each user 108 to receive and interact with the information generated by
the host server.

Software suite 116 also includes analytical software 118 that is configured
to analyze data stored in the financial database 120 to generate, for
example, the implied probability distribution of future prices of assets and
portfolios.

The financial database 120 stores financial information collected from the
financial information providers 106 and computation results generated by
the analytical software 118. The financial information providers 106 are
connected to the network 104 via a communication link 126, or the
financial information providers may feed the information directly to the
host server through a dialup or dedicated line (not shown).

Figure 4 gives a functional view of an implementation of the invention.
Structurally, the host server could be implemented as one or more web
servers coupled to the network, one or more applications servers running
the analytical software and other applications required for the system, and
one or more database servers that would store the financial database and
other information required for the system.

Figure 10 shows an example of a data feed 150 sent from the financial
information provider 106 to the host server 102 through the
communication link 126. Information is communicated to the host server
in the form of messages 151, 152. Each message contains a stream of one
or more records 153 each of which carries information about option prices
for an underlying asset. Each message includes header information 154
that identifies the sender and receiver, the current date 155, and an end of
message indicator 158, which follows the records contained in the
message.

Each record 153 in the stream includes an identifier 156 (e.g., the trading
symbol) of an underlying asset, an indication 158 of whether the record
pertains to a put or call, the strike date 160 of the put or call, the strike
price 162 of the put or call, current bid-ask prices 164 of the underlying
asset, bid-ask prices 166 for the option, and transaction volumes 168
associated with the option. The financial information provider 106 may be
an information broker, such as Reuters, Bridge, or Bloomberg, or any
other party that has access to or can generate the information carried in the
messages. The broker may provide information from sources that include,
for example, the New York Stock Exchange and the Chicago Board
Options Exchange.
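
The fields of a record 153 can be pictured as a simple data structure. The
sketch below is illustrative only: the class name, the field names, and the
pipe-delimited line format are assumptions made for the example, since the
description lists the contents of a record but not its encoding.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OptionRecord:
    """One record of the incoming feed (field names are hypothetical)."""
    symbol: str            # identifier 156, e.g. the trading symbol
    option_type: str       # indication 158: "put" or "call"
    strike_date: date      # strike date 160
    strike_price: float    # strike price 162
    underlying_bid: float  # bid-ask prices 164 of the underlying asset
    underlying_ask: float
    option_bid: float      # bid-ask prices 166 for the option
    option_ask: float
    volume: int            # transaction volume 168


def parse_record(line: str) -> OptionRecord:
    """Parse one record from an assumed pipe-delimited text line."""
    fields = line.strip().split("|")
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    if fields[1] not in ("put", "call"):
        raise ValueError(f"unknown option type: {fields[1]!r}")
    return OptionRecord(
        symbol=fields[0],
        option_type=fields[1],
        strike_date=date.fromisoformat(fields[2]),
        strike_price=float(fields[3]),
        underlying_bid=float(fields[4]),
        underlying_ask=float(fields[5]),
        option_bid=float(fields[6]),
        option_ask=float(fields[7]),
        volume=int(fields[8]),
    )


# Hypothetical example line; the values are made up for illustration.
record = parse_record("AMD|call|2000-08-15|90|82.0|82.25|3.5|3.75|1200")
```

An input module could build such records from each message and pass them
to the computation module; the mid-point of the option bid and ask prices
is one plausible input to the calculations.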

The financial database 120 stores the information received in the
information feed from the financial information providers and other
information, including, for example, interest rates and volatilities. The
financial database also stores the results generated by the analytical
software, including probability distribution functions with respect to the
underlying assets and assets that are not the subject of options.

The probability distribution information is generated continually (and
essentially in real time) from the incoming options data so that the
information provided and displayed to users is current. That is, the
information is not based on old historical data but rather on current
information about option prices.

In addition, other soft information can be accumulated, stored, and
provided to users, including fundamental characteristics of the underlying
assets, including prices, volatility values, beta, the identification of the
industry to which the asset belongs, the yield, the price to book ratio, and
the leverage. Other information could include forecast dates, earnings
forecasts, corporate action items, news items that relate to an industry,
and the volume of institutional holdings.

The messages from the information provider 106 may be sent in response
to requests by the host server 102, the information may be sent to the host
server 102 automatically at a specified time interval, or the information
may be sent as received by the information provider from its sources. The
financial database 120 may be maintained on a separate server computer
(not shown) that is dedicated to the collection and organization of
financial data. The financial database is organized to provide logical
relationships among the stored data and to make retrieval of needed
information rapid and effective.

The user 108 may use, for example, a personal computer, a TV set top
box, a personal digital assistant (PDA), or a portable phone to
communicate with the network 104. Any of these devices may be running
an Internet browser to display the graphical user interface (GUI) generated
by the host server 102.

The host server 102 may provide probability distribution information on
the network 104 in the form of web pages and allow the individual user
108, the financial institution 112, the third party web server 114, and the
media operator 124 to view the information freely. The host company that
runs the host server 102 may generate revenue by, for example, selling
advertisement space on its web pages to an advertisement provider 110.

The host server 102 may also provide proprietary information and
enhanced services to individual users 108, financial institutions 112, third
party web servers 114, and media operators 124 for a subscription fee.

The host server 102 may have a direct link to the financial institutions 112
to provide tailored information in a format that can be readily incorporated
into the databases of the financial institutions 112. Financial institutions
112 may include, for example, investment banks, stock brokerage firms,
mutual fund providers, bank trust departments, investment advisers, and
venture capital investment firms. These institutions may incorporate the
probability distribution information generated by the analytical software
118 into the financial services that they provide to their own subscribers.
The probability distribution information provided by the host server 102
enables the stock brokerage firms to provide better advice to their
customers.

A third party web server 114 may incorporate probability distribution
information into its web site. The information may be delivered in the
form of an information feed to the third party host of web server 114 either
through the Internet or through a dedicated or dial-up connection.

Figure 10 shows an example of a data feed 182 sent from the host server
102 to the third party web server 114 through communication link 128.
Data feed 182 carries messages 184 that include header information 186,
identifying the sender and receiver, and records 188 that relate to specific
underlying assets.

Each record 188 includes an item 190 that identifies a future date, a
symbol 192 identifying the asset, risk-neutral probability density
information 193 and cumulative distribution information 194. The record
could also include a symbol identifying a second asset 195 with respect to
the identified future date, and so on. Other information could be provided
such as a risk premium value with respect to the risk-neutral values.

Examples of third party web servers 114 are the web servers of
E*TRADE, CBS MarketWatch, Fidelity Investments, and The Wall Street
Journal. The third party web server 114 specifies a list of assets for which
it needs probability distribution information. Host server 102 periodically
gathers information from financial information provider 106 and its own
financial database 120, generates the probability distribution information
for the specified list of assets, and transmits the information to the third
party web server 114 for incorporation into its web pages.

Examples of the media operator 124 are cable TV operators and
newspaper agencies that provide financial information. For example, a
cable TV channel that provides stock price quotes may also provide
probability distribution information generated by the host server 102. A
cable TV operator may have a database that stores the probability
distributions of all the stocks that are listed on the NYSE for a number of
months into the future. The host server 102 may periodically send updated
information to the database of the cable TV operator. When a subscriber
of the cable TV channel views the stock price quotes on a TV, the
subscriber may send commands to a server computer of the cable TV
operator via modem to specify a particular stock and a particular future
date. In response, the server computer of the cable TV operator retrieves
the probability distribution information from its database and sends the
information to the subscriber via the cable network, e.g., by encoding the
probability distribution information in the vertical blanking interval of the
TV signal.

Likewise, a newspaper agency that provides daily transaction price quotes
may also provide the probabilities of stock prices rising above certain
percentages of the current asset prices at a predetermined future date, e.g.,
6 months. A sample listing in a newspaper may be "AMD 83 88 85
A40%", meaning that the AMD stock has a low price of $83, a high
price of $88, a closing price of $85 that is higher than the previous closing
price, and a 40% probability of rising 10% in 6 months.

The analytical software 118 may be written in any computer language
such as Java, C, C++, or FORTRAN. The software may include the
following modules: (1) input module for preprocessing data received from
the financial data sources; (2) computation module for performing the
mathematical analyses; (3) user interface module for generating a
graphical interface to receive inputs from the user and to display charts
and graphs of the computation results; and (4) communications interface
module for handling the communications protocols required for accessing
the networks.

Web pages and user interfaces 

A variety of web pages and user interfaces can be used to convey the 
information generated by the techniques described above. 

For example, referring to figure 5, a GUI 700 enables a user 108 to obtain
a range of financial services provided by the host server 102. The user 108
may see the implied probabilities of future prices of marketable assets 706
having symbols 704 and current prices 708. The information displayed
could include the probabilities 714 (or 718) of the asset prices rising above
a certain specified percentage 712 (or falling below a certain specified
percentage 716) of the reference price 710 within a specified period of
time 720.

For the convenience of the user 108, GUI 700 includes links 730 to
institutions that facilitate trading of the assets. The host company that
runs the host server 102 sells advertising space 728 on the GUI 700 to
obtain revenue. The GUI 700 also has links 726 to other services provided
by the host server 102, including providing [...]
management, on-line courses on topics related to trading of marketable
assets, research on market conditions related to marketable assets, and
management of portfolios of assets.

Referring to figure 6, the GUI 700 also may display an interactive web
page to allow the user 108 to view the market's current prediction of future
values of portfolios of assets. The past market price 734 and current
market price 736 of the asset portfolios 732 are displayed. Also displayed
is the price difference 738. The GUI 700 displays the probability 744 (or
746) that the portfolio 732 will gain (or lose) a certain percentage 740
within a specified time period 742. Examples of portfolios include stock
portfolios, retirement 401(k) plans, and individual retirement accounts.
Links 748 are provided to allow the user 108 to view the market's current
forecast of future price trends of the individual assets within each
portfolio.

Referring to figure 7, in another user interface, the GUI 700 displays an
interactive web page that includes detailed analyses of past price history
and the market's current forecast of the probability distribution of the
future values of a marketable asset over a specified period of time. The
GUI 700 includes price-spread displays 750 representing the cumulative
distribution values of the predicted future prices of an asset over periods of
time. The price-spread display 750a shows the price distribution data that
was generated at a time three months earlier. A three-month history of the
actual asset prices is shown as a line graph for comparison to give the user
108 a measure of the merit of the price distribution information. The
price-spread display 750b represents the predicted cumulative distribution
values of the asset prices over a period of one month into the future. The
left edge of display 750b, of course, begins at the actual price of the asset
as of the end of the past three-month period, e.g., the current DELL stock
price of $50. The probability distribution information implies, for
example, a 1% probability that the stock price will fall below $35, and a
99% probability that the stock price will fall below $80 in one month. GUI
700 includes table 752 that shows highlights of asset information and
graph 754 that shows sector risks of the asset. A box 755 permits a user to
enter a target price and table 757 presents the probability of that price at
four different future times, based on the calculated implied probability
distributions.

Referring to figure 8, in another approach, a window 402 is displayed on a
user's screen showing financial information along with two other windows
408 and 410 showing probability distribution information. The individual
user 108 could have previously downloaded a client program from the
host server 102. When the user is viewing any document, e.g., any web
page (whether of the host server 102 or of another host's server), the user
may highlight a stock symbol 404 using a pointer 406 and type a
predetermined keystroke (e.g., "ALT-SHIFT-Q") to invoke the client
program. The client program then sends the stock symbol as highlighted
by the user to the host server 102. The host server 102 sends probability
distribution information back to the client program, which in turn displays
the information in separate windows 408 and 410.

When the client program is invoked, a window 422 may be displayed
showing the different types of price information that can be displayed. In
the example shown, the "Probability distribution curve" and "Upper/lower
estimate curves" are selected. Window 408 shows the price range of AMD
stock above and below a strike price of $140 from July to December, with
90% probability that the stock price will fall between the upper and lower
estimate curves. Window 410 shows the probability density curve f(x) for
AMD stock for a future date of 8/15/2000. The user may also specify a
default function curve, such that whenever an asset name is highlighted,
the default function curve is displayed without any further instruction from
the user.

Tabular data such as those shown in TABLE 1 may be generated by the
host server 102 and transmitted over the network 104 to devices that have
limited capability for displaying graphical data. As an example, the
individual user 108 may wish to access asset probability distribution
information using a portable phone. The user enters commands using the
phone keypad to specify a stock, a price, and a future date. In response, the
host server 102 returns the probability of the stock reaching the specified
price at the specified future date in tabular format suitable for display on
the portable phone screen.

Referring to figure 9, a portable phone 500 includes a display screen 502,
numeric keys 506, and scrolling keys 504. A user may enter commands
using the numeric keys 506. Price information received from the host
server 102 is displayed on the display screen 502. Tabular data typically
includes a long list of numbers, and the user may use the scrolling keys 504
to view different portions of the tabular data.

In the example shown in display screen 502, the AMD stock has a current
price of $82. The cumulative distribution values F(x) for various future
prices on 8/15/2000 are listed. The distribution indicates a 40% probability
that the stock price will be below $80, implying a 60% probability of the
stock price being above $80. Likewise, the distribution indicates an 80%
probability that the stock price will be below $90, implying a probability
of 20% of the stock price being above $90.
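
The arithmetic used to read such a table can be sketched in a few lines.
The sketch is an illustration only: the values F(80) = 0.40 and
F(90) = 0.80 come from the example above, while the other table rows, the
function names, the linear interpolation between rows, and the flat
treatment outside the table are assumptions.

```python
from bisect import bisect_left

# Cumulative distribution values F(x); the rows at 70 and 100 are hypothetical.
cdf_table = [(70.0, 0.10), (80.0, 0.40), (90.0, 0.80), (100.0, 0.95)]


def cdf_at(table, price):
    """F(price), linearly interpolated and held flat outside the table."""
    prices = [p for p, _ in table]
    if price <= prices[0]:
        return table[0][1]
    if price >= prices[-1]:
        return table[-1][1]
    i = bisect_left(prices, price)
    (p0, f0), (p1, f1) = table[i - 1], table[i]
    return f0 + (f1 - f0) * (price - p0) / (p1 - p0)


def prob_above(table, price):
    """Probability that the future price exceeds the given price."""
    return 1.0 - cdf_at(table, price)


def prob_between(table, low, high):
    """Probability that the future price lies between low and high."""
    return cdf_at(table, high) - cdf_at(table, low)
```

With this table, `prob_above(cdf_table, 80.0)` reproduces the 60% figure
and `prob_above(cdf_table, 90.0)` the 20% figure, and the difference
F(90) - F(80) gives the probability of a price between $80 and $90.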

The visualization techniques discussed below are useful in enabling users
to visualize and quickly understand information that relates to assets.

Visualization of implied probability distributions of future prices 

As shown in figure 11, a visualization device 10 displays cumulative
probability distribution values of predicted relative future prices of Dell
Computer Corporation stock with respect to a current date 12 of July 1,
2000. The price 14 on July 1, 2000, is shown as being $41 lower than the
price 16 on February 1, 2000, which itself is set at an arbitrary starting
value of $0 for purposes of display. The display could be provided in actual
price terms, as a price change, or in terms of percentage return. The
probability distribution data on which the visualization device 10 is based
may be generated by, for example, the method discussed in the parent
patent application.

The predicted cumulative distribution values of the prices of Dell stock
over a period of several months into the future are illustrated by an
envelope 16 that begins at a point 18 and opens to the right.

The envelope 16 is divided into stripes 22, 24, 26, 28, 30, each of which
also begins at point 18 and opens to the right. Stripe 22, for example,
indicates a range of prices (all of which are below the current price) at
each date in the future and indicates the predicted odds (10%) that the
price will fall within that stripe. Similarly, stripe 26 indicates a range of
prices (above and below the current price) with an expected 40% odds of
occurring on various dates in the future. The odds of falling either above
or below envelope 16 are, as indicated, less than 1%. Each stripe is
displayed in a different color, and the colors are chosen to permit a viewer
to visualize the different stripes easily.

A similar envelope 32 starts at the nominal $0 price on February 1, 2000,
and ends on the current date. Envelope 32 represents the cumulative
distribution values of the prices of Dell stock that were predicted as of
February 1, 2000. The actual price history of Dell stock between February
1, 2000, and the current date is illustrated by the line 34. The extent to
which the actual price history of line 34 matches the predicted cumulative
distribution values gives a visual indicator to the user of the validity of the
prediction approach.

The combination of color, text, and data illustrated in figure 11 enables an
investor to assess the performance of an asset over time relative to his
price expectations.

The visualization device of figure 11 is also useful for assets other than
stocks, including mutual funds, and for portfolios of assets.

Figure 12 presents information similar to figure 11, but is expressed with
respect to projected return percentage rather than price. The example
shown in figure 12 relates to Check Point Software Technologies Ltd
stock as of a current date 66 of October 24, 2000. The x-axis represents
return percentage with respect to a start date. Line 62 shows the historical
return with respect to the stock price on the start date of January 1, 2000 at
point 67. On the current date 66, the cumulative return on the price of the
stock since start point 67 is approximately 200%.

An envelope 68 starts at point 66 and opens to the right. The envelope 68
illustrates the projected odds of the percentage return being within certain
ranges on each day for several months into the future relative to the
original start point 67. The ranges are expressed as stripes 52, 54, 56, 58,
and 60. The envelope and stripes are centered on a trend line 50 that has a
slightly positive slope to reflect the probability of future price levels
generated by a mathematical algorithm that is based on the
volatility of the options market. The algorithm is described in the related
pending United States patent application 09/641,589, filed 08/18/2000.

For example, the projected odds that the return (relative to the start point
67) will be between 50 and 100% on May 1, 2000, is 10%.

The same kind of data used to generate the display of figure 11 is used to
generate the device of figure 12 except that the data is processed to
convert the price data into change of price data for plotting along the
x-axis.

Figure 13 is similar to figure 12, except that it shows the effect of the
occurrence of the long-term capital gain tax rate transition (identified as
the vertical line 80 that is one year after the start date 82). After the date
represented by line 80, any sale of the stock would produce a lower tax
impact and a higher effective return than under the assumption of the
short-term capital gain tax rate that applies prior to that date. For that
reason, the envelope 84 is shifted upward and exploded for periods after
the transition date.

Visualization of asset style 

Figure 14 shows another visualization device that reflects an asset fund
style analysis that evaluates an asset fund (e.g., a mutual fund) by
comparing its historical returns to those achieved by a set of basic asset
classes (e.g., cash, bonds, large-cap growth stocks, large-cap value stocks).

The first step of the style analysis is a one-time selection of basic asset
classes, which should be mutually exclusive and exhaustive, to represent
all asset types of interest. In one example of classes (listed below) there
are seventeen market indices, seven of which represent stocks and the
remainder of which represent bonds.

The second step of style analysis determines the exposure of a given
mutual fund to these indices. This is achieved by solving an asset class
factor model, in which a fund return is expressed as a linear combination
of returns from basic asset classes plus a residual. The exposures are
determined by minimizing the variance of residuals using one-year weekly
data. It is believed that one-year weekly data can reflect a fund style more
accurately. In addition, fund exposures to basic asset classes are
constrained to be non-negative and to sum to one.
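
The constrained fit described in this step can be sketched as follows. This
is a minimal illustration, not the solver used for figure 14: projected
gradient descent is substituted here as one simple way of minimizing the
residual variance subject to non-negative exposures that sum to one, and
the function names and the synthetic weekly returns are hypothetical.

```python
import random


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_k >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumulative += uj
        t = (cumulative - 1.0) / j
        if uj - t > 0:
            theta = t  # keep the threshold from the largest valid j
    return [max(vi - theta, 0.0) for vi in v]


def demean(series):
    mean = sum(series) / len(series)
    return [s - mean for s in series]


def style_weights(fund_returns, class_returns, iterations=2000):
    """Exposures minimizing the variance of fund minus weighted class returns.

    class_returns is a list of K return series, each as long as fund_returns.
    """
    r = demean(fund_returns)
    X = [demean(c) for c in class_returns]
    K, T = len(X), len(r)
    # Upper bound on the curvature of the objective; sets a safe step size.
    step = 1.0 / (2.0 * sum(x * x for col in X for x in col) / T)
    w = [1.0 / K] * K
    for _ in range(iterations):
        resid = [r[t] - sum(w[k] * X[k][t] for k in range(K)) for t in range(T)]
        grad = [-2.0 * sum(resid[t] * X[k][t] for t in range(T)) / T
                for k in range(K)]
        w = project_to_simplex([w[k] - step * grad[k] for k in range(K)])
    return w


# Synthetic one-year weekly returns for three hypothetical asset classes.
rng = random.Random(1)
classes = [[rng.gauss(0.001, 0.02) for _ in range(52)] for _ in range(3)]
# A hypothetical fund that is 70% class 0 and 30% class 1.
fund = [0.7 * a + 0.3 * b for a, b, _ in zip(*classes)]
weights = style_weights(fund, classes)
```

For this synthetic fund the recovered exposures are close to 70%, 30%, and
0%. Repeating the fit on successive one-year windows would give the
per-period percentages that a style display reports.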

The third step of style analysis is to present the results in a form that
provides meaningful investment information. Style analysis results for a
given fund consist of percentages in each basic asset class, with the
dominant percentages determining the fund's style. Style drift for a given
fund is based on determining style changes over the most recent five
years.

In figure 14, the results of the analysis are displayed. The colors of the
respective cells 102 indicate how much of the fund's performance is
explained by regression to the style associated with the row in which the
cell appears, during the period represented by the column in which the cell
appears.

The example shown in figure 14 identifies each of seventeen indices
(styles 100) that are of interest to a broad group of individual investors.
For example, the style LG refers to a set of stocks that are characterized as
Large Capitalization Growth. The full list of groups in this example
follows:

1. Large-Cap Growth (LG)
2. Large-Cap Value (LV)
3. Mid-Cap (MC)
4. Small-Cap (SC)
5. European Stocks (EU)
6. Japanese Stocks (JP)
7. Emerging Markets (EM)
8. Cash (TB)
9. Intermediate Government Bonds (GI)
10. Long-term Government Bonds (GL)
11. Intermediate Corporate Bonds (CI)
12. Long-term Corporate Bonds (CL)
13. Corp Junk Bonds (HY)
14. Mortgage-Backed Securities (MG)
15. Real Estate (RE)
16. Municipal Bonds (MU)
17. Global Bonds (GG)

Thus, for cell 102, the regression indicates that about 45% of the fund's
performance is correlated to the LG style for that period in 2000.

The values determined by the regression are displayed in a grid with style
on the vertical axis and time on the horizontal axis. The color of each cell
102 indicates the percentage in accordance with the percentage scale
shown on the right.

The resulting visualization device enables an investor to assess the
performance of the asset over time relative to his investment preferences
and strategy.

Visualization of recent market activity

The ability to track the activity of a market of assets (such as stocks or 
mutual funds) as the activity unfolds is of great interest to investors. Many 
investors rely on daily publications of tabular data that present 
information such as volume, price change, asset identification, and 
performance. 

The visualization device shown in figure 15 collects, condenses, and 
enhances such information in a way that improves the ability of an 
investor to visually and quickly grasp recent and current market activity. 

The displays are updated continually and quickly throughout a trading 
day. 

As shown in figure 15, the visualization device 120 includes a radar-like 
display that is divided into sectors 122 arranged around a central point 
124. The device is also divided into rings 126 that are centered on point 
124 and filled with different colors to distinguish the different rings 
visually. 

Each of the sectors 122 is associated with an industry or sector of interest 
to investors, for example the technology sector or the financial sector. The 
size of each sector depends on the proportion of the asset items being 
displayed for the sector relative to the total number of asset items being 
depicted for the whole universe. 

Each of the rings represents a different percentage of price change during 
a recent period (e.g., during a single trading day). The rings are arranged 
with the largest percentage decline near the middle of the radar and the 
largest percentage increase near the periphery. 

Within each of the sectors, small dots 128 are displayed each representing 
a selected stock or asset within the industry sector represented by the radar 
sector. The distance of each dot from the central point 124 represents the 
percentage price change of the corresponding stock at a given time during 
a trading day. Gray dots represent small capitalization stocks; black dots 
represent large capitalization stocks. 

When multiple stocks in a sector have the same percentage change (e.g., at 
location 130), the dots are displayed at different angular positions relative 
to the central point, to convey to the viewer an impression of the 
distribution of the percentage changes within each sector. 
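
The geometry described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the displayed range of plus or minus 10 percent, and the even angular spacing of dots are hypothetical choices, not details taken from figure 15.

```python
import math

def sector_angles(counts):
    """Give each sector an angular span (degrees) proportional to its item count."""
    total = sum(counts.values())
    spans, start = {}, 0.0
    for name, count in counts.items():
        width = 360.0 * count / total
        spans[name] = (start, start + width)
        start += width
    return spans

def dot_position(pct_change, span, slot, n_slots, max_abs_pct=10.0, radius=1.0):
    """Return (x, y) for one dot.

    Radius encodes percentage change: the largest displayed decline sits at the
    center and the largest displayed increase at the periphery. `slot` spreads
    dots across the sector's angular span so equal changes do not overlap.
    The +/-10% display range is an assumed default.
    """
    clamped = max(-max_abs_pct, min(max_abs_pct, pct_change))
    r = radius * (clamped + max_abs_pct) / (2.0 * max_abs_pct)
    start, end = span
    theta = math.radians(start + (slot + 0.5) / n_slots * (end - start))
    return (r * math.cos(theta), r * math.sin(theta))

# Hypothetical universe: 30 technology items and 10 financial items.
spans = sector_angles({"technology": 30, "financial": 10})
x, y = dot_position(0.0, spans["financial"], slot=0, n_slots=2)
```

With these assumed inputs, an unchanged stock lands halfway between the center and the periphery, inside the quarter of the circle allotted to the smaller sector.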

Implementation details 

The visualization elements described above can be displayed on a wide 
range of devices, including desktop and laptop computers, personal digital 
assistants, portable telephones, publicly viewed large-screen displays, or 
closed circuit or broadcast/cable television monitors. 

The visualization elements can be displayed alone or embedded in other 
displayed material, including other financial information, general news 
information, or program material. For example, the elements can be 
displayed as part of a website page dedicated to financial information or as 
part of a general web portal page. The elements can be displayed as part of 
a broadcast or cable TV program. 

The raw data from which the visualization elements are created may be 
obtained on-the-fly electronically and/or may be stored as needed either 
locally or centrally. Software that processes the raw data to generate the 
derived values to be represented in the visualization elements may run 
locally or may be run remotely (and then downloaded to a local display). 
Software that processes the derived values to produce the visualization 
elements may be handled similarly. 

The raw data, the derived values, and the visualization elements can be 
updated more or less frequently, though in many cases real-time updates 
are especially useful. 

Each of the visualization elements could be made interactive by enabling a 
user to provide inputs, for example, mouse clicks, that indicate how the 
user wishes to alter the manner in which the elements are displayed, or the 
selection of data contained in them. Configuration features can be 
provided to enable the user to configure what information he receives, in 
what form it is displayed, and how often and how currently he receives it. 

Other implementations are within the scope of the following claims. 

For example, with respect to the visualization element shown in figure 15, 
the overall shape of the element could be other than round, the sectors 
could be other than simple pie shapes, the rings could be other than simple 
rings, the individual dots could be replaced by other icons, the dots or 
other icons could be arrayed in other arrangements from the center, and 
visible features other than color could be used to distinguish different 
portions of the display. 

A wide range of variants is also possible with respect to the visualization 
elements shown in the figures. 

Appendix A 

Three aspects of the invention are: 

1. The recognition of the desirability of displaying to a financial investment customer in real 
time, for example on a World Wide Web site, the probability distribution governing the 
price of a particular asset (e.g., a stock) at a selected future time. 

2. The recognition that such probability distributions can be derived from option prices for 
that asset, or for related assets, which are readily available in real time. 

3. The recognition that probability distributions involving several asset prices simultaneously 
are useful to investment customers in several contexts, especially in exploring hypothetical 
scenarios, and that single asset distributions such as (but not restricted to) the above can 
be meaningfully incorporated into multivariate distributions, manageably determined. 

In this appendix we first describe a basic method for deriving probability distributions for 
single assets from option prices. We next describe improvements on this basic method to address 
various practical issues. Then we take up the multivariate case and show how to extend this 
kind of single asset price distribution, or any other, to the multivariate case. Finally, we consider 
a number of novel multivariate applications, with emphasis on scenario exploration. 

1 Basic method 

A call option is an option to buy an asset (e.g., a stock) at a certain price $x$ (called the strike price) 
on a given expiration date $T$ days in the future. (An option exercisable only on the expiration 
date is called a European-style option; for simplicity we will consider in this discussion only this 
type of option.¹) Similarly, a put option is an option to sell an asset at a strike price $x$ on a 
given expiration date. (The "European-style" assumption of no possible early exercise is more 
important here, but can also be ignored for puts that are not too deeply "in the money.") 

Let $c(x)$ denote the price of a call option on an asset at strike price $x$, and $p(x)$ the price of a 
put option. Such prices are established by options market-makers. We have realized that such 
prices implicitly contain information about a "market view" of the probability distribution of 
the price of that asset at the expiration date. 

In a simple but precise form, this market view can be stated as follows. Suppose that we were 
given the call price curve $c(x)$ or the put price curve $p(x)$ as a continuous function of the strike 
price $x$ for all $x > 0$. Then, the second derivative of either the call or the put price curve is the 
market view of the risk-neutral probability density function (pdf) $f(x)$ of the asset price at the 
expiration date. In other words, $f(x) = c''(x) = p''(x)$. 

The idea that option prices determine some kind of implied probability distribution is fairly 
well known in the financial literature. The idea that a pdf can be computed by taking the 
second derivative of a continuous option price curve is known in the academic literature, but it 
does not appear to be very well known. For example, the standard textbook "Options, Futures, 
and Other Derivatives," by John C. Hull (Fourth Edition, 1999; Prentice-Hall) mentions implied 
probabilities, but not the second-derivative method. The best reference that we have been able 
to find is J. C. Jackwerth and M. Rubinstein, "Recovering probability distributions from option 
prices," J. Finance, vol. 51, pp. 1611-1631 (1996), which has only six prior references. 

¹ Even allowing for possible early exercise, most liquidly traded call options without large dividends can be 
treated as if there were no possibility of such exercise, since sale of the option is usually a better alternative; 
therefore, these call options behave similarly to European-style options. 

The risk-neutral distribution (at a fixed future time $T$, for a fixed asset) is defined as the price 
distribution that would hold if market participants were neutral to risk, which they generally 
are not. However, many asset pricing theories, such as those underlying Black-Scholes option 
theory and most of the variations found in the Hull book above, allow for the true risk-averse 
asset price distribution to be obtained from the risk-neutral distribution $f(x)$ just by adjusting 
the latter by an appropriate risk premium. If there are no dividends, the true distribution is just 
$f(x e^{-(\mu - r)T})$, where $\mu - r$ is the expected annual return rate for the stock in excess of the 
risk-free rate $r$. We use a variation on this simple format, slightly modified to allow for dividends (see 
below), though our invention could also work well with a more complicated adjustment. In this 
format, a value for $\mu - r$ must still be supplied. We use as a default the "consensus estimate" 
taken from the textbook "Active Portfolio Management" (1995) by Grinold and Kahn. These 
authors note a long-term average value of the risk premium to be 6% per year, and suggest 
multiplying this number by the stock's beta to get $\mu - r$. The parameter beta is the slope of the 
line giving a regression of the stock in question against a market portfolio, often taken as the 
S&P 500. This is the well-known CAPM estimate for the expected excess return. Whether good 
or bad, its stature as a consensus estimate makes it suited to our aim of providing a market 
view, though it is only a default. Our invention, which provides the risk-neutral component 
of the probabilities, could work with other estimates for the risk-averse adjustment parameter 
$\mu - r$ and with any explicit scheme for adjusting the risk-neutral probability density to the 
risk-averse probability density. It is worth pointing out that, for shorter time periods (even a 
month or two), the risk adjustment required is small and generally overwhelmed by fluctuations 
in the risk-neutral distribution itself. 
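
As a rough illustration of the default adjustment, the sketch below computes the CAPM-style excess return (6% per year times beta) and applies it as a pure shift in the log-price domain. The function names are hypothetical, the horizon is assumed to be expressed in years, dividends are ignored, and the uniform risk-neutral cdf is a toy stand-in rather than a market-derived distribution.

```python
import math

def excess_return(beta, market_premium=0.06):
    """Default estimate of mu - r: long-term market risk premium times beta."""
    return market_premium * beta

def risk_adjusted_cdf(risk_neutral_cdf, x, beta, t_years, market_premium=0.06):
    """Shift a risk-neutral cdf by (mu - r) * T in the log domain.

    Assumes no dividends and that the adjustment is a pure scaling of the
    price axis, so the adjusted cdf at x equals the risk-neutral cdf at
    x * exp(-(mu - r) * T).
    """
    shift = math.exp(excess_return(beta, market_premium) * t_years)
    return risk_neutral_cdf(x / shift)

def toy_cdf(x, lo=80.0, hi=120.0):
    """Toy risk-neutral cdf: uniform on [lo, hi] (illustrative only)."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

beta, t_years = 1.2, 0.25
factor = math.exp(excess_return(beta) * t_years)      # median scales by this factor
adjusted_median_prob = risk_adjusted_cdf(toy_cdf, 100.0 * factor, beta, t_years)
```

Over a quarter of a year the assumed shift is under two percent of the price, which is consistent with the remark that the adjustment is small at short horizons.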

We give a brief proof that the second derivative procedure gives the correct risk-neutral 
probability distribution. As in Hull, we may calculate the European call or put price as an expected 
value in the risk-neutral distribution. 

If the actual value of the asset on the expiration date is $v$, then the value of a call option at 
strike price $x$ is $\max\{v - x, 0\}$, and the value of a put option is $\max\{x - v, 0\}$. If the actual 
value is a random variable with pdf $f(v)$, then the expected value of a call option at $x$ at the 
expiration date is 

$$c_T(x) = E_v[\max\{v - x, 0\}] = \int_x^\infty (v - x) f(v)\, dv,$$

and the expected value of a put option at $x$ at the expiration date is 

$$p_T(x) = E_v[\max\{x - v, 0\}] = \int_0^x (x - v) f(v)\, dv.$$

The current values $c(x)$ and $p(x)$ may be obtained by discounting $c_T(x)$ and $p_T(x)$ by $e^{-rT}$, 
where $r$ is the risk-free interest rate, but for our purposes, forecasting probability distributions 
at time $T$, we do no discounting, and henceforth just write $c(x) = c_T(x)$, $p(x) = p_T(x)$. 
Parenthetically, from these expressions we observe that 

$$p(x) - c(x) = \int_0^\infty (x - v) f(v)\, dv = x - E_v[v] = x - s^*,$$

where $s^* = E_v[v]$ is the expected value of the asset at the expiration date under the risk-neutral 
distribution. (If there are no dividends, then $s^* = s e^{rT}$; if there are dividends, then in general 
it is necessary to subtract from $s e^{rT}$ the value at time $T$ of the dividends.) This well-known 
relation is called put-call parity; it shows why either price curve carries the same information. 

From the above expression for $c(x)$, it follows that its first derivative is 

$$c'(x) = -\int_x^\infty f(v)\, dv = F(x) - 1,$$

where $F(x) = \int_0^x f(v)\, dv$ is the cumulative distribution function (cdf) of the random variable 
$v$. To prove this, note that $v - x = \int_x^v dz$. Therefore 

$$c(x) = \int_x^\infty (v - x) f(v)\, dv = \int_x^\infty dv \int_x^v dz\, f(v) = \int_x^\infty dz \int_z^\infty dv\, f(v) = \int_x^\infty dz\, (1 - F(z)),$$

where we interchange the order of integration over the variables $v$ and $z$ in the two-dimensional 
region $\mathcal{R} = \{(v, z) : x \le z \le v\}$. The last expression implies that $c'(x) = -(1 - F(x))$. 

From put-call parity, it follows similarly that 

$$p'(x) = 1 + c'(x) = F(x).$$

Since the cdf and pdf are related by $F'(x) = f(x)$, these expressions in turn imply that the 
second derivative of either $c(x)$ or $p(x)$ is the pdf $f(x)$: 

$$c''(x) = p''(x) = F'(x) = f(x).$$

The general character of the option price curves $c(x)$ and $p(x)$ is therefore as follows: 

• For all $x$ less than the minimum possible value of $v$ (i.e., such that $F(x) = 0$), 
$c(x) = E_v[v] - x = s^* - x$ and $p(x) = 0$. In other words, $c(x)$ is a straight line of slope $-1$ starting 
at $c(0) = E_v[v] = s^*$, while $p(x) = 0$. 

• For all $x$ greater than the maximum possible value of $v$ (i.e., such that $F(x) = 1$), $c(x) = 0$ 
and $p(x) = x - s^*$. In other words, $p(x)$ is a straight line of slope $+1$ and $x$-intercept $s^*$, 
while $c(x) = 0$. 

• These two line segments are joined by a continuous convex ∪ curve whose slope increases 
from $-1$ to $0$ for $c(x)$, and from $0$ to $+1$ for $p(x)$. 

We note that the fact that the mean $E_v[v]$ of the pdf $f(x)$ is $s^*$, the value in future dollars at 
time $T$ of the underlying price $s$ (less the value of any dividends), implies that option prices must 
be constantly adjusted to reflect changes in the underlying price $s$, even if there is no market 
activity in the options. 

The fact that $s^* = E_v[v]$ also implies that an option price curve can make no prediction about 
the general direction of the underlying price $s$. However, the option price curve does predict the 
shape of the pdf $f(x)$, and in particular its volatility. 
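
The relations derived in this section can be checked numerically. The following sketch is an illustration only: it assumes a toy pdf (uniform on 80 to 120, chosen because its call curve is known in closed form), integrates the undiscounted call value by the midpoint rule, and compares finite differences of the call curve with the cdf and pdf. None of the names or numbers come from market data.

```python
def uniform_pdf(v, lo=80.0, hi=120.0):
    """Toy pdf f(v): uniform on [lo, hi] (an illustrative assumption)."""
    return 1.0 / (hi - lo) if lo <= v <= hi else 0.0

def call_value(pdf, x, v_max, steps=4000):
    """Undiscounted call value c(x) = integral from x to v_max of (v - x) f(v) dv.

    Uses the midpoint rule; v_max must be at or above the largest possible v.
    """
    h = (v_max - x) / steps
    return h * sum((i + 0.5) * h * pdf(x + (i + 0.5) * h) for i in range(steps))

def c(x):
    return call_value(uniform_pdf, x, v_max=120.0)

delta = 1.0
# First derivative plus one estimates the cdf: F(x) = 1 + c'(x).
cdf_at_100 = 1.0 + (c(100.0 + delta) - c(100.0 - delta)) / (2.0 * delta)
# Second difference estimates the pdf: f(x) = c''(x).
pdf_at_100 = (c(100.0 + delta) - 2.0 * c(100.0) + c(100.0 - delta)) / delta ** 2
```

For this toy distribution the call curve between 80 and 120 is the parabola (120 - x)^2 / 80, so the differences reproduce F(100) = 0.5 and f(100) = 1/40 up to rounding.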
1.1 Approximations based on a finite subset of bid-asked option prices 

In practice, option prices $c(x)$ and $p(x)$ are quoted only for a finite subset of equally-spaced 
strike prices $x$, namely $x = n\Delta$ for integer $n$ and spacing $\Delta$. We denote $c(n\Delta)$ and $p(n\Delta)$ by $c_n$ 
and $p_n$, respectively. Moreover, quotes specify only a bid-asked spread, not exact prices. In this 
subsection we give methods for dealing with these problems. (Most of the Jackwerth-Rubinstein 
paper (op. cit.) is concerned with these kinds of curve-fitting problems.) 

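The first derivatives $c'(x)$ and $p'(x)$ at $x = (n + \tfrac{1}{2})\Delta$ may be estimated by the first differences 

$$c'_{n+1/2} = \frac{c_{n+1} - c_n}{\Delta}, \qquad p'_{n+1/2} = \frac{p_{n+1} - p_n}{\Delta}.$$

The corresponding estimates of the cdf $F_{n+1/2} = F((n + \tfrac{1}{2})\Delta)$ are 

$$F_{n+1/2} = 1 + c'_{n+1/2} \quad \text{or} \quad F_{n+1/2} = p'_{n+1/2}.$$

Thus, using both bid and ask prices for both European-style puts and calls, one can compute 
four different estimates for the cdf $F_{n+1/2}$, which can then be combined into a single estimate. 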

This combination will preferably take into account whether $x = (n + \tfrac{1}{2})\Delta$ is much less than 
the underlying price $s$ ("deep out-of-the-money"), near $s$ ("near the money"), or much greater 
than $s$ ("deep in-the-money"), according to the different patterns of setting bid-asked spreads in 
these different ranges. Another consideration is avoiding quotes near prices where early exercise 
is likely, such as deep in-the-money puts. 

Similarly, the second derivatives $c''(x)$ and $p''(x)$ at $x = n\Delta$ may be estimated by the first 
differences of the estimates of the first derivatives; e.g., 

$$c''_n = \frac{c'_{n+1/2} - c'_{n-1/2}}{\Delta} = \frac{c_{n+1} - 2c_n + c_{n-1}}{\Delta^2}.$$

We may take $c''_n$, $p''_n$, or some combination as above as our estimate $f_n$ of the pdf $f(n\Delta)$. 

Note that since $f(x) \ge 0$, option prices should satisfy a convexity condition, e.g., 
$c_{n+1} - 2c_n + c_{n-1} \ge 0$ for call option prices. Indeed, violation of this condition would allow making money 
via a risk-free "butterfly straddle" involving buying one call option at $(n + 1)\Delta$ and another at 
$(n - 1)\Delta$, and selling two call options at $n\Delta$. A similar result holds for put options. 
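
A minimal sketch of these finite-difference estimates follows. The function names are hypothetical, the inputs are assumed to be single (for example mid-market) call quotes on an equally spaced strike grid, and the sample quotes are illustrative numbers generated from a uniform distribution on 80 to 120, not observed prices.

```python
def cdf_estimates(calls, delta):
    """F_{n+1/2} = 1 + (c_{n+1} - c_n) / delta for each adjacent pair of quotes."""
    return [1.0 + (calls[n + 1] - calls[n]) / delta for n in range(len(calls) - 1)]

def pdf_estimates(calls, delta):
    """f_n = (c_{n+1} - 2 c_n + c_{n-1}) / delta^2 at each interior strike."""
    return [(calls[n + 1] - 2.0 * calls[n] + calls[n - 1]) / delta ** 2
            for n in range(1, len(calls) - 1)]

def convexity_violations(calls):
    """Indices n where c_{n+1} - 2 c_n + c_{n-1} < 0 (a butterfly would pay)."""
    return [n for n in range(1, len(calls) - 1)
            if calls[n + 1] - 2.0 * calls[n] + calls[n - 1] < 0.0]

# Illustrative undiscounted call quotes at strikes 90, 95, 100, 105, 110.
delta = 5.0
calls = [11.25, 7.8125, 5.0, 2.8125, 1.25]

cdf_mid = cdf_estimates(calls, delta)      # at strikes 92.5, 97.5, 102.5, 107.5
pdf_grid = pdf_estimates(calls, delta)     # at strikes 95, 100, 105
bad = convexity_violations([11.25, 7.8125, 6.0, 2.8125, 1.25])
```

In the last line the quote at strike 100 has been raised on purpose, so the convexity check flags that strike; with bid and ask quotes the same functions could be applied to each side before combining the estimates.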

1.2 Dynamic estimates 

The methods considered in the previous subsection allow estimation of the cdf and pdf at a 
subset of $\Delta$-spaced values of $x$, based on a static set of option quotes at a particular time. 

As previously noted, however, option prices must change continually in response to changes 
in the underlying price $s$. Let $s^*$ denote the corresponding forward price at expiration (the 
price $s$ evaluated with interest). Suppose this price (measured in dollars at expiration) moves 
up (or down) by a small amount, an increment $\epsilon$ in its logarithm, say, with little or no change 
in volatility. Here $\epsilon$ may be viewed as, approximately, the percentage move $\delta/s^*$ caused by a 
move of $\delta$ in the (forward) stock price. We expect in this situation that the (forward) probability 
distribution for the stock price will just be shifted by $\epsilon$ in the log domain. That is, the distribution 
will appear to be identical there, except with a mean shifted by $\epsilon$. Thus, the value of the new cdf 
at $x = e^{\ln x}$ is $F(e^{\ln x - \epsilon}) = F(x/a)$, where $F$ denotes the original cdf with distribution mean 
$s^*$, and $a = e^{\epsilon}$. A reasonable call price functional equation that gives the same effect, upon 
differentiation, is 

$$a\, c(s^*, x/a) = c(a s^*, x),$$

where $c(s^*, x)$ denotes the price, in dollars at expiration, for a call option at strike $x$ when the 
underlying is at (forward) price $s^*$. Note in this equation that all other variables, such as volatility, are 
assumed to be the same, which will only be approximately true, even for very small values of $\epsilon$. 
But, assuming this approximation, we can think of an option price at strike $x$, measured when 
the (forward) price has moved to $a s^*$, as equivalent (after division by $a$) to the price of an 
option at strike $x/a$, but corresponding to the current underlying price $s^*$. Considering all 
the strikes at which options are frequently quoted, and thinking additively, we can effectively 
observe $c(x)$ (and $p(x)$) for a different subset of approximately equally-spaced strike prices, 
roughly $x = n\Delta - \delta$ for various values of $\delta = \epsilon s^*$. Some care must, of course, be taken to ensure 
simultaneity of prices, of option and underlying. For this reason, we may prefer to consider the 
values of $n\Delta$ (corresponding to the various standard strike values) separately, and synchronize 
observed time of sales for an option at a given strike with the underlying security. Implied 
volatilities (discussed below) could be monitored, to ensure their changes relative to $\epsilon$ were 
small. 

Using a technique similar to that described in the above paragraphs, meaningful average option 
prices for a given strike can also be computed, using thin strike intervals and using either short 
time intervals or time series methods (time averages weighting the present more than the past). 
Note that, without the framework described in this subsection, the computation of "average" 
option prices at a given strike is problematic when the stock price varies in the period over 
which the average is taken. 

To summarize: Given enough movements of the underlying price, we can effectively observe 
prices and compute estimates as above for a much more finely quantized subset of strike prices 
$x$, and provide a framework for improving accuracy through averaging methods. 
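
The scaling relation between a quote taken after the forward price has moved and an equivalent quote at the original forward price can be illustrated as follows. This is a sketch under stated assumptions: the undiscounted lognormal (Black-Scholes) call formula is used only as a convenient test case in which volatility is exactly unchanged, and the helper `rescale_quote` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and variance 1

def forward_call(s_star, x, sigma, t):
    """Undiscounted lognormal call price with forward price s_star and strike x."""
    root = sigma * math.sqrt(t)
    d1 = (math.log(s_star / x) + 0.5 * root * root) / root
    d2 = d1 - root
    return s_star * _STD.cdf(d1) - x * _STD.cdf(d2)

def rescale_quote(price, strike, a):
    """Map a quote observed at forward a*s_star to an equivalent one at s_star.

    Under the functional equation, c(s_star, strike/a) = price / a.
    """
    return strike / a, price / a

s_star, sigma, t = 100.0, 0.30, 0.25      # illustrative values
a = math.exp(0.01)                         # forward moved up by 1% in log terms

observed = forward_call(a * s_star, 105.0, sigma, t)       # quote after the move
eff_strike, eff_price = rescale_quote(observed, 105.0, a)  # back at forward s_star
direct = forward_call(s_star, eff_strike, sigma, t)
```

Here `eff_strike` is slightly below 105, which is how a sequence of small moves in the underlying fills in strikes between the standard quoted ones.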

2 Methods for extrapolation and smoothing 

There are two major limitations to the basic methods of the previous section. One is that option 
quotes are available only for certain expiration dates. Another, not so obvious, is that option 
quotes are reliable primarily for options in which there is substantial market activity. These 
would typically be nearer-term options at strike prices near the money (the underlying price). 

To extend our prediction methods to times other than expiration dates and over wider ranges 
of strike prices (and also to help reduce "noise" in our displays), we use extrapolation and 
smoothing techniques. We have found that it is advantageous to do extrapolation and smoothing 
in the volatility domain. 

There are many reasons for this advantage. For example, option practitioners are well aware 
of the kinds of shapes that the volatility curves (sometimes called "volatility smiles") have had 
historically, in various markets, and how these curves vary with time; this can be a guide to 
imposing structure on the smoothing curves to prevent overfitting of possible artifacts. Many 
records have been kept of the volatilities implied by option prices, and it is easy to examine how 
in the past they have changed with respect to price behavior. For example, the Chicago Board 
Options Exchange makes public its average near-the-money volatility index (now called VIX) for 
S&P 100 options back to 1986. Finally, it is easier to work visually with volatility curves, which 
would theoretically be flat if $f(x)$ were lognormal, than with visual differences in near-lognormal 
pdfs, which can all look very much alike. Mathematically, model improvements can be made 
in the volatility domain just by changing coefficients of low-degree polynomial approximations, 
even though these affect higher-order terms in power series for the corresponding cdfs or pdfs. 

The following subsections explain more precisely how to work in the volatility domain. 

2.1 Lognormal pdfs 

The standard Black-Scholes theory of option pricing (see Hull, op. cit.) yields a lognormal pdf 
$f(v)$ whose expected value is $E_v[v] = s^*$, such that $\ln v$ is a Gaussian (normal) random variable 
with variance $\sigma^2 T$, where the parameter $\sigma$ is called the volatility rate of the asset, and $T$ is 
the time to expiration. By a standard property of lognormal distributions, this implies that the 
mean of $\ln v$ is $E_v[\ln v] = \ln s^* - \sigma^2 T/2$. 

From this pdf follows the famous Black-Scholes call option pricing formula [Hull, Appx. 11A]: 

$$c(x) = E_v[\max\{v - x, 0\}] = s^* N(d_1(x)) - x N(d_2(x)),$$

where $N(d_1(x))$ and $N(d_2(x))$ are values of the cumulative distribution function of a Gaussian 
random variable of mean zero and variance 1 at the points 

$$d_1(x) = \frac{\ln(s^*/x) + \sigma^2 T/2}{\sigma\sqrt{T}} = d_2(x) + \sigma\sqrt{T};$$

$$d_2(x) = \frac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}} = \frac{E_v[\ln v] - \ln x}{\sigma\sqrt{T}}.$$

(Recall that our version of the call price is not discounted, and is given in dollars at time $T$, and 
that $s^*$ is today's stock price, valued in dollars at time $T$, less the value of any dividends.) Note 
that $\sigma\sqrt{T}$ is the standard deviation of $\ln v$; therefore $-d_2(x)$ is just $\ln x$, measured in standard 
deviations from the mean $E_v[\ln v]$. 

Similarly, by put-call parity, we have the Black-Scholes put option pricing formula 

$$p(x) = c(x) + x - s^* = s^*(N(d_1(x)) - 1) - x(N(d_2(x)) - 1) = x N(-d_2(x)) - s^* N(-d_1(x)).$$

Taking the derivative with respect to $x$, and using $s^* N'(d_1(x)) = x N'(d_2(x))$ and $d_1'(x) = d_2'(x)$ 
(the latter equation holding under the assumption of constant volatility, which we will later 
drop), we obtain 

$$F(x) = c'(x) + 1 = -N(d_2(x)) + 1 = N(-d_2(x)).$$

Now $F(x)$ is the probability that $v \le x$, which is equal to the probability that $\ln v \le \ln x$, which 
since $\ln v$ is Gaussian with mean $E_v[\ln v]$ and standard deviation $\sigma\sqrt{T}$ is given by 

$$F(x) = \Pr\{v \le x\} = \Pr\{\ln v \le \ln x\} = N\!\left(\frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}\right) = N(-d_2(x)).$$

Thus we have verified that the Black-Scholes pricing formulas give the correct cdf $F(x)$. The 
derivative of $F(x)$ will thus yield the correct lognormal distribution $f(x) = F'(x)$. 
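
The verification above can also be carried out numerically. The sketch below implements the undiscounted call and put formulas of this subsection and compares finite differences of the call curve with the closed-form lognormal cdf and pdf. Function names and parameter values are illustrative assumptions.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and variance 1

def d2(s_star, x, sigma, t):
    return (math.log(s_star / x) - 0.5 * sigma * sigma * t) / (sigma * math.sqrt(t))

def forward_call(s_star, x, sigma, t):
    """c(x) = s* N(d1) - x N(d2), in dollars at expiration (no discounting)."""
    dd2 = d2(s_star, x, sigma, t)
    dd1 = dd2 + sigma * math.sqrt(t)
    return s_star * _STD.cdf(dd1) - x * _STD.cdf(dd2)

def forward_put(s_star, x, sigma, t):
    """p(x) = x N(-d2) - s* N(-d1)."""
    dd2 = d2(s_star, x, sigma, t)
    dd1 = dd2 + sigma * math.sqrt(t)
    return x * _STD.cdf(-dd2) - s_star * _STD.cdf(-dd1)

def lognormal_cdf(s_star, x, sigma, t):
    """F(x) = N(-d2(x))."""
    return _STD.cdf(-d2(s_star, x, sigma, t))

def lognormal_pdf(s_star, x, sigma, t):
    """f(x) = F'(x) = N'(-d2(x)) / (x sigma sqrt(T))."""
    return _STD.pdf(-d2(s_star, x, sigma, t)) / (x * sigma * math.sqrt(t))

s_star, sigma, t, x, h = 100.0, 0.30, 0.25, 105.0, 0.01   # illustrative values
c_up, c_mid, c_dn = (forward_call(s_star, x + k * h, sigma, t) for k in (1, 0, -1))
cdf_fd = 1.0 + (c_up - c_dn) / (2.0 * h)         # 1 + c'(x)
pdf_fd = (c_up - 2.0 * c_mid + c_dn) / (h * h)   # c''(x)
```

The finite-difference values `cdf_fd` and `pdf_fd` should agree closely with `lognormal_cdf` and `lognormal_pdf` at the same strike, and the put and call prices should differ by x - s*.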
2.2 Characterization of general cdfs 

Now let $F(x)$ be an arbitrary cdf on $\mathbb{R}^+$; i.e., a function that monotonically increases from 0 to 
1 as $x$ goes from 0 to infinity. For simplicity we will assume that $F(x)$ is strictly monotonically 
increasing; i.e., $f(x) = F'(x) > 0$ everywhere. Then there exists a continuous one-to-one 
"warping function" $y : \mathbb{R}^+ \to \mathbb{R}$ such that $F(x) = N(y(x))$ everywhere; i.e., such that the 
probability that a random variable $v$ with cdf $F(x)$ will satisfy $v \le x$ is equal to the probability 
that a standard Gaussian random variable $n$ with mean zero and variance 1 will satisfy $n \le y(x)$. 
Similarly, there is an inverse warping function $x(y)$ such that $F(x(y)) = N(y)$. 

Given the warping function $y(x)$, the cdf $F(x)$ may be retrieved from the relation $F(x) = 
N(y(x))$. Therefore the cdf $F(x)$ completely specifies the warping function $y(x)$, and vice versa; 
i.e., both curves carry the same information. 

If $F(x)$ is the cdf of a lognormal variable $v$ such that $\ln v$ has mean $E_v[\ln v] = \ln s^* - \sigma^2 T/2$ 
and variance $\sigma^2 T$, as in the previous subsection, then the warping function is given by 

$$y(x) = -d_2(x) = \frac{\ln x - (\ln s^* - \sigma^2 T/2)}{\sigma\sqrt{T}} = \frac{\ln x - E_v[\ln v]}{\sigma\sqrt{T}}.$$

For this reason we may sometimes write $y(x)$ as $-d_2(x)$, even when the cdf is not lognormal so 
that the right-hand equation above for $d_2(x)$ does not hold. 
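
A warping function can be computed from any strictly increasing cdf with the standard normal quantile function. The sketch below is an illustration with hypothetical names: `warp` applies the inverse normal cdf to F(x), and `inverse_warp` recovers x(y) by bisection over an assumed bracket. The lognormal cdf is used only as a test case in which y(x) should equal -d2(x).

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and variance 1

def warp(cdf, x):
    """y(x) such that cdf(x) = N(y(x)); requires 0 < cdf(x) < 1."""
    return _STD.inv_cdf(cdf(x))

def inverse_warp(cdf, y, lo=1e-9, hi=1e9, iters=200):
    """x(y) such that cdf(x(y)) = N(y), found by bisection on [lo, hi]."""
    target = _STD.cdf(y)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Test case: lognormal cdf with illustrative parameters.
S_STAR, SIGMA, T = 100.0, 0.30, 0.25

def minus_d2(x):
    return (math.log(x / S_STAR) + 0.5 * SIGMA * SIGMA * T) / (SIGMA * math.sqrt(T))

def lognormal_cdf(x):
    return _STD.cdf(minus_d2(x))

y_110 = warp(lognormal_cdf, 110.0)
x_back = inverse_warp(lognormal_cdf, y_110)
```

For a cdf that is not lognormal, the same `warp` call gives a curve that is no longer linear in ln x; its departure from a straight line is one way to see the non-lognormal structure.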

2.3 Implied volatilities 

If $f(x)$ is not lognormal, then the Black-Scholes pricing formulas do not hold. Nonetheless, given 
an option price $c(x)$ or $p(x)$, it is common practice to define the implied volatility $\sigma(x)$ as the 
value of $\sigma$ such that the Black-Scholes pricing formula holds, for a given $x$, $s$ and $T$. 

The implied volatility curve $\sigma(x)$ so defined is a function of the strike price $x$, which is constant 
if and only if the pdf $f(x)$ is actually lognormal. In practice, it is typically a convex ∪ curve, 
called a "volatility smile." See, e.g., Hull, Chapter 17. 
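
Because the Black-Scholes call price increases with volatility, the implied volatility at a strike can be found by a simple one-dimensional search. The sketch below uses bisection on the undiscounted call formula; the bracket of 0.000001 to 5 and the function names are assumptions made for illustration, and other root-finding methods would serve equally well.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and variance 1

def forward_call(s_star, x, sigma, t):
    """Undiscounted Black-Scholes call: s* N(d1) - x N(d2)."""
    root = sigma * math.sqrt(t)
    d1 = (math.log(s_star / x) + 0.5 * root * root) / root
    return s_star * _STD.cdf(d1) - x * _STD.cdf(d1 - root)

def implied_vol(price, s_star, x, t, lo=1e-6, hi=5.0, iters=100):
    """Volatility at which the Black-Scholes call formula reproduces `price`.

    Relies on the call price being increasing in sigma; the bracket
    [lo, hi] is an assumed search range.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if forward_call(s_star, x, mid, t) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Round trip with illustrative numbers: price at 30% volatility, then invert.
quote = forward_call(100.0, 105.0, 0.30, 0.25)
sigma_hat = implied_vol(quote, 100.0, 105.0, 0.25)
```

Repeating the inversion at each quoted strike gives the pointwise curve whose shape is the volatility smile.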

From Subsection 2.1, we can see that there is a second method of calculating implied volatilities, 
as follows. Suppose that we have an estimate of the cdf $F(x)$. Define the cdf-implied volatility 
$\sigma_1(x)$ as the value of $\sigma$ such that the Black-Scholes cdf formula $F(x) = N(-d_2(x, \sigma, T))$ holds, 
for a given $x$, $s$ and $T$. 

The first method has the advantages of being defined directly from raw price data, and of being 
well understood in the financial community. However, the second method has the following 
advantages: 

1. It is easier to calculate, at least from estimates of $F(x)$; 

2. It gives a simpler and arguably more intuitive relationship between volatility and the cdf 
$F(x)$. If we use the traditional implied volatility $\sigma(x)$, then the relationship is instead 

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\, \sigma'(x);$$

3. It fits better with the multivariate theory to be developed below. 

We have observed that the two curves $\sigma(x)$ and $\sigma_1(x)$ seem to be fairly similar, at least as to 
the direction of their slope, and are generally not too far apart in value "near the money". Also 
$\sigma_1(x) = \sigma(x)$ whenever $\sigma(x)$ has zero slope, though $\sigma_1(x)$ is a little smaller than $\sigma(x)$ when the 
slope $\sigma'(x)$ is negative (which often occurs for stocks). See the above equation. Finally, one 
function is as ad hoc as the other. Therefore, because of the above reasons, we generally prefer 
to use the cdf-implied volatility curve $\sigma_1(x)$. 

In any case, it is clear that either $\sigma(x)$ or $\sigma_1(x)$ contains the same information as any of the 
curves $c(x)$, $p(x)$, $F(x)$ or $f(x)$. From $\sigma(x)$ or $\sigma_1(x)$ we can recover $c(x)$ or $F(x)$ using the 
Black-Scholes call option pricing or cdf formula, and from this we can obtain all other curves. 

2.4 Extrapolation and smoothing in the volatility domain 

The volatility curve $\sigma(x)$ or $\sigma_1(x)$ may be calculated pointwise from the corresponding curve 
$c(x)$ or $p(x)$ to give a set of values, at a finite subset of strike prices $x$. Each of these values may 
be deemed to have a certain degree of reliability. 

It is then a standard problem to fit a smoothed and extrapolated curve $\sigma(x)$ or $\sigma_1(x)$ to these 
points, taking into account their relative reliabilities. Any standard smoothing and extrapolation 
method may be used. In general, the usual problems of avoiding overfitting or oversmoothing 
must be addressed. 

It is well known that implied volatilities also vary with time. We generally wish to estimate 
curves $\sigma(x, T)$ or $\sigma_1(x, T)$ as replacements for the constant volatility $\sigma$ in the Black-Scholes 
formulas, e.g., $c(x) = c(x, \sigma, T)$ or $F(x) = N(-d_2(x, \sigma, T))$. 

In an especially meaningful example, we have experimented with a class of smoothing algorithms 
used in "Implied volatility functions: Empirical tests," by B. Dumas, J. Fleming and 
R. E. Whaley, J. Finance, vol. 53, pp. 2059-2106, Dec. 1998. These authors fit an implied 
volatility curve $\sigma(x)$, for the purpose of setting up a "strawman" option price model for testing 
(and defeating) a theory regarding the role of volatility in option pricing. Their "strawman" 
option pricing model $c(x)$ was obtained by putting the resulting smoothed curve back into the 
Black-Scholes call formula. It is a "strawman" ad hoc model, because no intuitive notion of 
stock volatility could possibly vary with strike price, which the stock never "sees." Nevertheless, 
their model performed admirably, surpassing in predictive power the highly regarded "implied 
tree" method. One possible explanation offered was that their model mimicked in a smooth way 
interpolation methods actually employed by practitioners in the options markets. (See the 
discussion of "volatility matrices" in Hull, cited above.) Such an approach to option pricing seems 
ideal to us, because of its accuracy and because its underlying rationale represents a market 
view. Thus, we use the Dumas-Fleming-Whaley model for our own entirely different purpose, 
that of forecasting probability distributions. All that is necessary is to differentiate their call 
price model, which, conveniently for us, is a smooth function of strike price and other standard 
variables such as time, current stock price, and the risk-free rate of interest. The formula for 
the cdf $F(x)$ is, as before, this derivative with 1 added, or 

$$F(x) = N(-d_2(x)) + \frac{\partial c}{\partial \sigma}\, \sigma'(x).$$
We can make this very explicit. We have 

$$\frac{\partial c}{\partial \sigma} = s^* \sqrt{T}\, N'(d_1(x)),$$

where $N'(z)$ denotes standard normal density, while $\sigma'(x)$ may be computed by differentiating 
the Dumas-Fleming-Whaley fitted volatility curve. The latter has the form 

$$\sigma(x, T) = a_0 + a_1 x + a_2 x^2 + a_3 T + a_4 T^2 + a_5 x T.$$

The coefficients $\{a_i\}$ are determined by regression. This kind of quadratic curve-fitting is easily 
implemented. Dumas-Fleming-Whaley impose a constraint to prevent their volatilities from 
going below 0 (or even below 0.01), and we have imposed further constraints on extrapolations 
(which we often carry out beyond the range of their tests) to ensure that the resulting cdf values do not go 
below zero or above one. We have experimented with other variations on their basic approach, for 
example, using linear interpolation in the time domain, where we do not need to take derivatives. 
Our methods would, of course, work with any approach, possibly quite different, to volatility 
curve-fitting, though the general Dumas-Fleming-Whaley approach has many things going for 
it: accuracy, conformity to marketplace use of Black-Scholes, smoothness (differentiability, in 
particular), conformity to historical experience regarding the smile structure of volatility curves 
(especially important for extrapolation), and simplicity (which, beyond ease of implementation, 
helps avoid overfitting). These advantages are achieved in a probability context that was not 
considered in the paper where these volatility curves were introduced. 
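
The pieces of this subsection can be combined in a short sketch: a quadratic volatility surface with a floor of 0.01, the call price obtained by inserting that volatility into the undiscounted Black-Scholes formula, and the cdf obtained from the strike derivative, clipped to the range zero to one. The coefficient values below are hypothetical and chosen only to give a plausible downward-sloping smile; they are not fitted to any data, and the function names are illustrative.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, mean 0 and variance 1

# Hypothetical coefficients (a0..a5); in practice these come from regression.
A = (0.9, -0.01, 0.00004, 0.02, 0.0, 0.0)
FLOOR = 0.01  # volatility floor

def smile_sigma(x, t, a=A):
    raw = a[0] + a[1] * x + a[2] * x * x + a[3] * t + a[4] * t * t + a[5] * x * t
    return max(FLOOR, raw)

def smile_sigma_dx(x, t, a=A):
    """Derivative of the fitted volatility with respect to strike (0 on the floor)."""
    raw = a[0] + a[1] * x + a[2] * x * x + a[3] * t + a[4] * t * t + a[5] * x * t
    return 0.0 if raw <= FLOOR else a[1] + 2.0 * a[2] * x + a[5] * t

def smile_call(s_star, x, t, a=A):
    """Undiscounted Black-Scholes call evaluated at the strike-dependent volatility."""
    root = smile_sigma(x, t, a) * math.sqrt(t)
    d1 = (math.log(s_star / x) + 0.5 * root * root) / root
    return s_star * _STD.cdf(d1) - x * _STD.cdf(d1 - root)

def smile_cdf(s_star, x, t, a=A):
    """F(x) = N(-d2) + (dc/dsigma) * sigma'(x), clipped to [0, 1]."""
    root = smile_sigma(x, t, a) * math.sqrt(t)
    d1 = (math.log(s_star / x) + 0.5 * root * root) / root
    vega = s_star * math.sqrt(t) * _STD.pdf(d1)
    value = _STD.cdf(-(d1 - root)) + vega * smile_sigma_dx(x, t, a)
    return min(1.0, max(0.0, value))

h = 1e-3
numeric = 1.0 + (smile_call(100.0, 100.0 + h, 0.25)
                 - smile_call(100.0, 100.0 - h, 0.25)) / (2.0 * h)
analytic = smile_cdf(100.0, 100.0, 0.25)
```

The closing lines compare the analytic cdf with a central finite difference of the call curve; the two should agree when the clipping is inactive, which gives a quick consistency check on any fitted coefficient set.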

3 The multivariate case 

The methods in the previous sections are capable of generating a display of raw or smoothed 
and extrapolated probability distributions for any optionable asset. Option prices are quoted 
on a large number of securities, as well as on certain indices, such as the S&P 500. 

However, an investor would also like to know future probability distributions for: 

• His or her entire portfolio; 

• Mutual funds; 

• A security without a quoted option; 

• A security in a hypothetical scenario. 

All of these questions involve considerations of several securities at once, and the probabilities
of their simultaneous configuration of prices. This is clearly a consideration in the first two
items above, but also enters in the third, where we would want to extract as much information
as possible about the security without a quoted option price from those correlated with it that
do have quoted options. Finally, in scenario analysis there are many questions that involve
considering the probabilities of several security prices occurring at once, including changes in
factors influencing the market that might be modeled by changes in a portfolio of those securities
most affected. We will take all of these issues up in the remainder of this document, but for now
we just try to give a basic introduction.

For a portfolio of securities, or a mutual fund, we are interested in a composite asset of the
form

$$x = h_1 x_1 + \cdots + h_n x_n,$$

where the $x_i$ are all assets for which we individually know the cdf $F(x_i)$ or the pdf $f(x_i)$. To give
our method the most flexibility, we do not require that this knowledge come from any particular
procedure, though we favor the approach of the preceding two sections. However, even for some
securities or indices with a quoted option, we might not feel there was sufficient option activity
to justify a full fit of a volatility curve, and might take a cruder substitute, even a flat straight
line based on an average of available implied volatilities. In addition, it is convenient to allow
the possibility that a few assets we are monitoring might not have any quoted option at all; this
is easily accommodated by, say, using a flat volatility curve with a historical value for volatility.
For testing purposes and comparisons we might even want to consider a list of assets with all
volatility curves given this way. In any case our methodology here is very general, and we only
require that we know warping functions $y_i(x_i)$ such that $F(x_i) = N(y_i(x_i))$ for all $i$. If the
asset has an active options market, then the warping functions may be determined by either
first estimating $F(x_i)$ directly from (finite differences of) options price data, as in subsection
2.2, or by using the approach discussed later in Section 2 of extrapolating and smoothing in the
volatility domain. In the latter case we have an explicit form of the warping function $y_i(x_i)$ in
terms of a fitted volatility curve $\sigma_1(x_i,T)$ as $y_i(x_i) = -d_2(x_i, \sigma_1(x_i,T), T)$, and this equation
can also be used with any volatility curve with the assets above, that might have fewer or no
traded options. In a later section we will discuss portfolios in the logarithm domain, possibly
containing long and short positions. One can think of warping them to standard normal directly,
subtracting the mean and dividing by the standard deviation. Alternatively, to keep our notation
uniform one can invent an asset with price $x_i$ such that $-d_2(x_i)$ gives this warped value (using
for $\sigma_1(x)$ the observed historical volatility). But we wish to emphasize that the method we are
describing works with ANY single-variable warping functions, even using a different one for each
variable. The only further substantive ingredient is the plausibility of using JOINTLY normal
distributions, which we now discuss.

The general problem is to find a multivariate probability distribution for the complete set of
variables $(x_1, \ldots, x_n)$, or equivalently for their logarithms. In simple financial models generalizing
the Black-Scholes framework, the multivariate distribution of the logvariables is multivariate
(i.e., jointly) normal; see Musiela and Rutkowski's book "Martingale methods in financial markets"
(1999). This implies that all portfolios of these logvariables are jointly normal, and can
also be used with other logvariables and portfolios of them to form a jointly normal distribution.
Thus, if we wish, it is reasonable to use BARRA (or functionally equivalent) factors as single
(log)variables in our model, using, say, individual normal distributions for them based on historical
volatility. These factors may represent fundamentals of companies or even macroeconomic
variables such as interest rates. We do not further discuss such factors, but refer to the book
of Grinold and Kahn cited above, which also describes how to closely approximate them as
portfolios of security returns. Our preference is to not use BARRA factors directly, but stay as
much as possible in the world of optionable securities, and address questions involving BARRA
factors in terms of approximating portfolios consisting mostly of optionable securities. (But for
testing and comparisons, it is still useful to be able to include them directly, and we do have
that capability.)

Now we certainly do not wish to use only the simple multidimensional Black-Scholes model,
which would not directly allow the nonlognormal input from our single-variable distributions
based on the options markets. At the same time, option prices on individual assets do not tell
us anything about how assets interact, in particular, their correlations. Fortunately, correlations
may be estimated from past (historical) data, and may be viewed as covariances for data
that has been standardized (has standard deviation 1). Each multivariate normal distribution
is determined by its mean and covariance matrix. Thus, a natural approach is to use the
individual distributions to transform or "warp" the variables to standard normal, then impose a
multivariate normal structure based on the correlation matrix. This procedure is independent
of the individual warping functions, which may be different for different individual variables,
and in particular, can incorporate our market-based option distributions for individual variables
representing securities with active options markets. A slightly different approach is to use
correlations of the warped variables. This procedure is likely to be more accurate, but may involve
more computational time.

We indicate some details. As before, it is notationally convenient to use $v_i$ as a second notation
for $x_i$. Let $C = (\rho_{ij})$ be the
correlation matrix of the log variables $(\ln v_1, \ldots, \ln v_n)$, whose entries are the cross-correlations

$$\rho_{ij} = \frac{E(\ln v_i \ln v_j) - E(\ln v_i)E(\ln v_j)}{\sqrt{\mathrm{Var}(\ln v_i)\,\mathrm{Var}(\ln v_j)}}.$$

Then all diagonal terms $\rho_{ii}$ are equal to 1, and $C$ is a positive semi-definite covariance matrix,
which we may here assume to be nonsingular (positive definite). If we use instead correlations
of warped variables, we have simply

$$\rho_{ij} = E(y_i y_j).$$

Let us define $F_C(y_1, \ldots, y_n)$ as the cdf of a multivariate Gaussian random variable with mean
zero and covariance matrix $C$. Thus, $F_C(b_1, \ldots, b_n)$ is the probability that each variable $y_i$ is
at most some value $b_i$. There are more elaborate versions, such as $F_C(a_1, \ldots, a_n; b_1, \ldots, b_n)$,
giving the probability that each $y_i$ satisfies $a_i < y_i \le b_i$. In the single-variable case these latter
functions are obtained from the simple cdf by a single subtraction, involving two terms, but the
corresponding bivariate case involves four terms, and in $n$ dimensions there would be $2^n$ terms.
However, each of these more elaborate cdf's can be directly computed as an integral, just like
the simple cdf. Since the more elaborate cdf's are needed for Monte Carlo calculations, possibly
in high dimensions, it is best to think of them as being computed directly.
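
To make the term count concrete, the following Python sketch evaluates the box probability from a plain cdf by summing over all $2^n$ corners with alternating signs. It is only an illustration of why this route becomes expensive; the independent-normal cdf used to exercise it is an assumption chosen because its answer can be checked by hand, and the function names are hypothetical.

```python
from itertools import product
from statistics import NormalDist


def box_probability(cdf, lower, upper):
    """P(lower_i < y_i <= upper_i for all i), built from 2^n signed cdf values."""
    n = len(lower)
    total = 0.0
    for picks in product((0, 1), repeat=n):
        # Each corner takes the upper bound where pick == 1, else the lower bound.
        corner = [upper[i] if p else lower[i] for i, p in enumerate(picks)]
        sign = (-1) ** (n - sum(picks))
        total += sign * cdf(corner)
    return total


def independent_normal_cdf(point):
    """Joint cdf of independent standard normals (identity correlation matrix)."""
    nd = NormalDist()
    out = 1.0
    for value in point:
        out *= nd.cdf(value)
    return out


# Bivariate case: four terms, as noted in the text.
square_prob = box_probability(independent_normal_cdf, [-1.0, -1.0], [1.0, 1.0])
```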

We then define the multivariate cdf's

$$F(x_1, \ldots, x_n) = F_C(y_1(x_1), \ldots, y_n(x_n)),$$

and

$$F(a_1, \ldots, a_n; b_1, \ldots, b_n) = F_C(y_1(a_1), \ldots, y_n(a_n); y_1(b_1), \ldots, y_n(b_n)),$$

where the $y_i(x_i)$ are the known warping functions for the individual variables. We find it convenient,
with some abuse of language, to speak of $F(x_1, \ldots, x_n)$ as "the cdf", even though we have
all of the above functions in mind, and to use $F(x_1, \ldots, x_n)$ as a proxy for the whole distribution
(which it does, theoretically, determine). This multivariate cdf then has the following properties:

• Since the marginals of $F_C(z_1, \ldots, z_n)$ are Gaussian with mean 0 and variance 1, the marginals
of $F(x_1, \ldots, x_n)$ are equal to $N(y_i(x_i)) = F(x_i)$; i.e., they are correct according to
each single-variable model.

• If the logvariables $(\ln v_1, \ldots, \ln v_n)$ are actually jointly Gaussian, then the multivariate cdf
$F(x_1, \ldots, x_n)$ is correct.

In summary, the true joint distribution is approximated by a jointly lognormal distribution
using historical correlations, combined with warping functions on each variable such that the
marginal distribution of each variable is correct according to a selected single-variable model
(for example, according to our single-variable model for optionable securities, or according to
the lognormal model using historical volatility). The single variables may actually be portfolios,
with a default distribution for the portfolio return being lognormal, based on historical volatility.
This multivariate theory generalizes both our single-variable theory and standard multivariate
(log)Gaussian models. It again allows for market input through option prices, to the extent
that components have an active option market, but does not exclude nonoptionable securities,
and also allows portfolios as single variables. In this way BARRA (or functionally equivalent)
factors are also allowed because of their interpretation as portfolios of long and short positions.

4 Applications to portfolios 

Given the multivariate cdf $F(x_1, \ldots, x_n) = F_C(y_1(x_1), \ldots, y_n(x_n))$, we can answer many typical
questions. We first give an overview, and then take up some of the applications in more detail.

As one example, suppose that we want to find the cdf of a portfolio variable

$$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n,$$

where the $h_i$ are arbitrary coefficients. A simple Monte Carlo method, probably not the fastest,
is to draw random samples from the jointly Gaussian distribution with cdf $F_C(y_1, \ldots, y_n)$, transform
each $y_i$ via the inverse mapping function $x_i(y_i)$, and then compute the resulting output
sample

$$x = h_1 x_1(y_1) + \cdots + h_n x_n(y_n).$$

After enough samples, we will have an approximation to the cdf of $x$. More precisely, the
probability that $a < x \le b$ is, approximately, the fraction of samples $y_1, \ldots, y_n$ with
$a < h_1 x_1(y_1) + \cdots + h_n x_n(y_n) \le b$, and this approximation becomes exact in the limit for large
sample sizes. This works for real portfolios, or for portfolios constructed from a number of assets
and a residual variable, as might arise from a regression. Usually the regression is done in the
log domain, which we discuss below. Note that the Monte Carlo method just described works
perfectly well if the expression for $x$ above is replaced by any function $f(x_1, \ldots, x_n)$ of the $x_i$,
possibly quite nonlinear.
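
The Monte Carlo procedure just described can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by the authors: the lognormal inverse warp stands in for a market-derived $x_i(y_i)$, and all function names, parameters, and sample sizes are assumptions.

```python
import math
import random


def cholesky(c):
    """Lower-triangular factor L with L * L^T = c (c must be positive definite)."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low


def sample_correlated(low, rng):
    """One draw of standard normals with correlation matrix L * L^T."""
    z = [rng.gauss(0.0, 1.0) for _ in low]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(low))]


def lognormal_inverse_warp(spot, vol, t):
    """Stand-in inverse warp x(y); a fitted volatility curve would replace it."""
    def x_of_y(y):
        return spot * math.exp(vol * math.sqrt(t) * y - 0.5 * vol * vol * t)
    return x_of_y


def portfolio_interval_probability(h, inverse_warps, corr, a, b,
                                   n_samples=20000, seed=0):
    """Estimate P(a < h_1 x_1(y_1) + ... + h_n x_n(y_n) <= b) by sampling."""
    rng = random.Random(seed)
    low = cholesky(corr)
    hits = 0
    for _ in range(n_samples):
        y = sample_correlated(low, rng)
        x = sum(hi * w(yi) for hi, w, yi in zip(h, inverse_warps, y))
        if a < x <= b:
            hits += 1
    return hits / n_samples
```

Replacing the weighted sum inside the loop by any other function of the sampled prices gives the nonlinear case mentioned above.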

4.1 Log domain portfolios

In this subsection, we point out how our methods fit with another paradigm in common use in
the financial community, and set up some further notation. It is common to work in the return
domain, or equivalently, with logarithms; i.e.,

$$\ln x = \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n.$$

Ignoring any possible identification of these variables with those in the previous section, the same
discussion and Monte Carlo method as above applies, if we regard $x$ as a nonlinear portfolio
$x = f(x_1, \ldots, x_n) = \exp(\beta_1 \ln x_1 + \cdots + \beta_n \ln x_n)$. If the sum $\beta$ of the $\beta_i$'s is 1, such an $x$ may be
written $x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n$ where $h_i = \beta_i x / x_i$. Even if $\beta$ is not 1, incremental changes
("returns") $dx/x$ computed from this equation for $x$ are consistent with the above expression for
$\ln x$. It is common in the financial community to think of $h_i$ as approximately a constant $h_i$, so
that for short periods, where the $x_i$'s do not change too much, this equation for $x$ is comparable
to the portfolio equation in the previous subsection.^1
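
A small numerical check of the relation $h_i = \beta_i x / x_i$ may help. The Python sketch below uses made-up weights and prices (assumptions for illustration only) and confirms that, when the $\beta_i$ sum to 1, the holdings $h_i$ reproduce the log-domain portfolio value.

```python
import math


def log_portfolio_value(betas, prices):
    """x = exp(beta_1 ln x_1 + ... + beta_n ln x_n)."""
    return math.exp(sum(b * math.log(p) for b, p in zip(betas, prices)))


def equivalent_holdings(betas, prices):
    """Holdings h_i = beta_i * x / x_i at the current prices."""
    x = log_portfolio_value(betas, prices)
    return [b * x / p for b, p in zip(betas, prices)]


# Hypothetical two-asset example with weights summing to 1.
betas = [0.6, 0.4]
prices = [50.0, 200.0]
x_value = log_portfolio_value(betas, prices)
holdings = equivalent_holdings(betas, prices)
linear_value = sum(h * p for h, p in zip(holdings, prices))
```

Because the $h_i$ depend on the current prices, the match holds exactly only at the prices used to compute them.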

For an asset $x$ not given explicitly in terms of the $x_i$, we obtain a similar expression
via linear regression:

$$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n.$$

The $\beta_i$ for $i \ne 0$ are regression coefficients chosen to minimize the variance of the residual in
historical data (perhaps subject to constraints, such as $\beta_i \ge 0$ and a constraint on the sum $\beta$). For example,
$x$ might be a security without a quoted option, and the $x_i$ securities for
which we individually know the probability distributions, in addition to the required correlation
coefficients for $x$. We have written the residual term as $\ln x_0$ (usually thinking of $\beta_0 = 1$ and
the residual as normally distributed).^2 The mean of the latter could be nonzero, giving the
regression "alpha", a constant term making the mean of the regression correct. Alternatively,
we could modify the equation to allow an explicit alpha, and keep the residual mean zero.
Another minor variation might include the addition of a dummy variable with constant return,
to adjust the value of $x$ up or down. In particular, this gives another way of adjusting the
residual mean to zero. This equation gives the previous one as a special case if we allow $\beta_0 = 0$.

4.1.1 Fast fits of portfolios 

One approach, which promises to be relatively fast computationally, is the following. As in the
development of cdf-implied volatilities in Section 2, let us assume that each logvariable $\ln x_i$
above is "Gaussian" with nonconstant variance $\sigma_1^2(x_i)T$. In other words, the cdf is given by
$F(x_i) = N(-d_2(x_i, \sigma_1(x_i), T))$. Our aim will be to give $F(x)$ by a similar equation, using some
kind of fitted curve $\hat\sigma_1(x)$. We will assume that we have some class of volatility curves in mind,
with a small number of parameters which must be determined.

If the variables $\ln x_0, \ldots, \ln x_n$ were truly jointly Gaussian, then $\ln x$ would also be Gaussian.
Its variance would be given by the formula

$$\mathrm{Var}(\ln x) = \sum_{i,j} \beta_i \sigma_i \rho_{ij} \beta_j \sigma_j T,$$

where $\rho_{ij}$ is the correlation between $\ln x_i$ and $\ln x_j$, and $\sigma_i^2 T = \mathrm{Var}(\ln x_i)$. We therefore define
the estimate $\hat\sigma_1(x)$ of $\sigma_1(x)$ by the conditional expectation

$$\hat\sigma_1(x)^2 = E\Big(\sum_{i,j} \beta_i \sigma_1(x_i) \rho_{ij} \beta_j \sigma_1(x_j) \ \Big|\ \ln x = \sum_i \beta_i \ln x_i\Big).$$
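
The quadratic form inside this expectation is simple to evaluate per sample. The Python sketch below computes it and, as one possible way to approximate the conditioning, averages it over samples whose log-portfolio value lies within a small tolerance of the target level. The tolerance-based selection, the function names, and the `local_vol` callback are assumptions made for illustration.

```python
import math


def quadratic_form(betas, vols, corr):
    """Sum over i, j of beta_i * vol_i * rho_ij * beta_j * vol_j."""
    n = len(betas)
    return sum(betas[i] * vols[i] * corr[i][j] * betas[j] * vols[j]
               for i in range(n) for j in range(n))


def slab_average(samples, betas, local_vol, corr, log_level, half_width):
    """Average the quadratic form over samples with ln x near log_level.

    samples   : list of price vectors (x_1, ..., x_n)
    local_vol : hypothetical callback (i, x_i) -> sigma_1(x_i)
    Returns None when no sample falls inside the tolerance band.
    """
    kept = []
    for prices in samples:
        ln_x = sum(b * math.log(p) for b, p in zip(betas, prices))
        if abs(ln_x - log_level) <= half_width:
            vols = [local_vol(i, p) for i, p in enumerate(prices)]
            kept.append(quadratic_form(betas, vols, corr))
    return sum(kept) / len(kept) if kept else None
```

With constant volatilities the average reduces to the unconditional quadratic form, which gives a quick sanity check.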

The calculation of the above conditional expectation may be done with Monte Carlo methods.
In the language of nonlinear portfolios above, we would take the function $f(x_1, \ldots, x_n)$ to be
0 outside a thin multidimensional solid enclosing the hyperplane defined by $\ln v = \sum_i \beta_i \ln v_i$.
Inside the solid we would take $f(x_1, \ldots, x_n)$ equal to the above expression for $\mathrm{Var}(\ln x)$, divided
by the probability of being in the solid (also a Monte Carlo calculation). In terms of samples, we
just take the average of $\mathrm{Var}(\ln x)$ over all the samples that end up inside the thin solid. However,
it is not necessary to compute all values of $\hat\sigma_1(x)$, but only enough to fit the parameters for the
volatility curves we are using.

^1 Thus $h_i\,dx_i/x = \beta_i\,dx_i/x_i = \beta_i\,d\ln x_i$, so that for small changes $dx_i$ the change $dx$ from the first equation is
approximately the same as would be obtained from the second. However, this relationship requires "rebalancing"
to remain a good approximation for long periods.

^2 For the residual term $i = 0$, we can use a constant variance, or impose some generic nonconstant structure
based on observed behavior.

The estimated mean of $\ln x$ would be $\ln s^* - \tfrac{1}{2}\hat\sigma_1(x)^2 T$, with $s^*$ determined as before, or replaced
with some risk-averse estimate, to obtain the risk-averse or "true" distribution. (It is common,
incidentally, to use factor models such as these to estimate a risk-averse version of $\ln s^*$ as
$\sum_i \beta_i \ln s_i^*$, using risk-averse values of $s_i^*$.)

Also, we mention here one useful variation: We may prefer not to view the residual term
$\beta_0 \ln x_0$ as part of the model, and instead write down a joint pdf only for the remaining variables. In this
case we can use the double expectation

$$\hat\sigma_1(x)^2 = E\Big(E\Big(\sum_{i,j} \beta_i \sigma_1(x_i) \rho_{ij} \beta_j \sigma_1(x_j) \ \Big|\ \ln x = \sum_i \beta_i \ln x_i\Big)\Big),$$

where the inner expectation is with respect to the variables $x_1, x_2, \ldots, x_n$, and the outer
expectation is with respect to the residual. We might take the standard deviation $\sigma(x_0)$ of this
residual (taking $\beta_0 = 1$) as a constant, determined historically, or make an estimate based on
some leverage model.

Now we can estimate the cdf $F(x)$ by

$$F(x) = N(-d_2(x, \hat\sigma_1(x), T)),$$

as in the univariate case. To summarize, we use our multivariate model to determine parameters
for a univariate model of the portfolio. After that is done, we can obtain probabilities for the
portfolio without having to go back to the multivariate model, thus achieving a savings in
time. We could take this one step further and think of randomly generating values of $\hat\sigma_1(x, T)$,
independently of any Monte Carlo philosophy (but perhaps still throwing away values of $x$ too
far out-of-the-money), and then using the values obtained to do the regression required in the
Dumas-Fleming-Whaley approach.

5 "What-if" questions

The multivariate distribution lends itself to the study of many questions regarding conditional
probabilities. For example, suppose that we want to know the effect of the increase or decrease of
some segment of the market on a portfolio, or the increase or decrease of some macro-economic
factor. BARRA, following earlier ideas of Ross, has viewed such macro-economic factors as
portfolios with both long and short positions. Similarly, BARRA considers market segments
associated to price-to-earnings ratios and other fundamental parameters, as well as to industry
groupings, as portfolios. (See the book of Grinold-Kahn cited above.) Thus, we are led simply
to consider the effect of one portfolio on another.

For definiteness, let us suppose the first portfolio is $x$, where as above

$$\ln x = \beta_0 \ln x_0 + \beta_1 \ln x_1 + \cdots + \beta_n \ln x_n,$$

and the second portfolio is $y$, where

$$\ln y = \gamma_0 \ln y_0 + \gamma_1 \ln x_1 + \cdots + \gamma_n \ln x_n.$$

We take $\beta_0 = \gamma_0 = 1$, and view $\ln x_0$ and $\varepsilon = \ln y_0$ as residuals with mean 0. The latter residual
is not assumed to be a factor in our multivariate model. Consider the following typical "what-if"
question: Let $A$ and $B$ be given positive constants. If we know $x > A$ at time $T$, what is the
probability that $y > B$ at time $T$? We give two approaches to this problem, the first probably
quicker, but possibly not as accurate, using a regression to avoid at least some Monte Carlo
calculations.

5.1 "What-if": An approach involving part regression, part Monte Carlo

We have $\ln y > \ln B$ iff $\ln y - \varepsilon > \ln B - \varepsilon$. All correlations $\rho_{ij}$ between $\ln x_i$ and $\ln x_j$ are assumed
known. We may also assume that we have historical values of volatilities $\sigma_i = \sqrt{\mathrm{Var}(\ln x_i)}$.
(Alternatively, we could estimate such values as expected values of "implied" volatilities, but it
would not be difficult to maintain an inventory of historical values, and more in the spirit of
this part of the calculation to do so.) Thus we can estimate the historical covariances between
$\ln x$ and $\ln y - \varepsilon$:

$$\mathrm{Cov}(\ln x, \ln y - \varepsilon) = \sum_{i,j} \beta_i \sigma_i \rho_{ij} \gamma_j \sigma_j,$$

as well as $\sigma_{\ln x} = \sqrt{\mathrm{Var}(\ln x)}$, $\sigma_{\ln y - \varepsilon} = \sqrt{\mathrm{Var}(\ln y - \varepsilon)}$ and the correlation

$$\rho(\varepsilon) = \rho_{\ln x,\, \ln y - \varepsilon} = \frac{\mathrm{Cov}(\ln x, \ln y - \varepsilon)}{\sigma_{\ln x}\, \sigma_{\ln y - \varepsilon}}.$$

This gives a standard regression for the variable $\ln y - \varepsilon$ expressed in standard deviations from
its mean, in terms of a similarly standardized expression for $\ln x$. Note that $\varepsilon$ has mean 0 by
construction. Put $d_2(s^*, x, \sigma) = \dfrac{\ln(s^*/x) - \sigma^2 T/2}{\sigma\sqrt{T}}$. Thus $-d_2(s_x^*, x, \sigma_{\ln x})$ measures standardized $\ln x$
using historical volatility, and $-d_2(x) = -d_2(s_x^*, x, \hat\sigma_1(x))$ measures "standardized" (warped) $\ln x$
using the cdf-implied volatility curve $\hat\sigma_1(x)$, as discussed in the previous section. Here $s_x^*$ denotes
our best estimate for the value of $x$ at time $T$.

Let $\hat\sigma_1(y, \varepsilon)$ denote the volatility curve associated with $\ln y - \varepsilon$, which may be estimated as
in the previous section (or computed from estimates of $\hat\sigma_1(y)$ and the standard deviation of
the residual, if we are willing to view the residual as uncorrelated with $\ln y - \varepsilon$, as is guaranteed
in unconstrained regression). Put $d_2(y, \varepsilon) = d_2(s_y^*, y e^{-\varepsilon}, \hat\sigma_1(y, \varepsilon))$, so that $-d_2(y, \varepsilon)$ is a
"standardized" measure of $\ln y - \varepsilon$. Then the standard regression appropriate to our model is

$$-d_2(y, \varepsilon) = \rho(\varepsilon)\,(-d_2(x)).$$

There is a residual associated with this regression, which we have not written down. It is presumably
normal, and its variance may be computed. For notational reasons we will just imagine
it has been incorporated into the original $\varepsilon$. As is apparent from the form of the expressions
in the display, an alternative to the above regression is to do it with the warped correlation
coefficients suggested in the previous section. If, in addition, it was appropriate to view the
original portfolios as linear combinations of warped variables (our standard normal marginals),
the regression above could be done without any recourse to Monte Carlo calculations. Similar
remarks would apply if we used constant historical volatility functions throughout, though
presumably the latter procedure would lose accuracy.

In any case, we can now answer our "what-if" question as a simple expectation in the univariate
normal distribution of the (adjusted) residual $\varepsilon$. Abbreviate $d_2(s_x^*, A, \hat\sigma_1(A))$ to $d_2(A)$ and
$d_2(s_y^*, B e^{-\varepsilon}, \hat\sigma_1(B, \varepsilon))$ to $d_2(B, \varepsilon)$. Assume $\rho(\varepsilon) > 0$ (the natural case of a positive correlation).
Then we have

$$\begin{aligned}
\Pr\{y > B \mid x > A\} &= E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A))\big) \\
&= E\big(\Pr(-d_2(x) > \rho(\varepsilon)^{-1}(-d_2(B, \varepsilon)) \mid -d_2(x) > -d_2(A))\big) \\
&= E\big(\min\{1,\ N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))\}\big).
\end{aligned}$$

The first equation follows just because $-d_2(y, \varepsilon)$ is monotonically increasing as a function of $y$;
that is, the condition that $y > B$ is completely equivalent to the condition $-d_2(y, \varepsilon) > -d_2(B, \varepsilon)$.
Similar remarks hold for the condition $x > A$, while the expression $\Pr\{y > B \mid x > A\}$ just means
the probability that the condition $y > B$ holds when it is known that $x > A$. The second equation
is then derived with the displayed expression above for $-d_2(y, \varepsilon)$. (If $\rho(\varepsilon)$ is negative, the
inequality involving its inverse reverses.) This inner expectation is then calculated in the normal
distribution. For values of $\varepsilon$ for which $-d_2(A)$ is as large as $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$, the expectation
is a certainty, and yields the value 1. When $-d_2(A)$ is smaller than $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$,
its cumulative normal distribution value $N(-d_2(A))$ is smaller than $N(\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon)))$, and
the probability $1 - N(-d_2(A)) = N(d_2(A))$ that the standard normal variable $z = -d_2(x)$
is at least $-d_2(A)$ is larger than the corresponding probability $1 - N(\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))) =
N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon))$ that $z$ be at least $\rho(\varepsilon)^{-1}(-d_2(B, \varepsilon))$. The ratio $N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))$,
which is the desired inner expectation, is thus smaller than 1, as is appropriate for a probability,
conditional or not. If $\rho(\varepsilon)$ is negative, similar reasoning leads instead to the expression
$E\big(\max\{0,\ (N(d_2(A)) - N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon))) / N(d_2(A))\}\big)$ for the desired conditional probability.
Although the final answer in either case is an expectation (over $\varepsilon$), it is essentially an integral
that could be computed quickly with power series. (A very simple and accurate power-series
expansion of $N(z)$ is given on p. 252 of the book by Hull cited above.) Using that, one could
determine by iterative methods what value of $\varepsilon$ makes, say, the ratio $N(\rho(\varepsilon)^{-1} d_2(B, \varepsilon)) / N(d_2(A))$
equal to 1, and then integrate the ratio against the standard normal pdf from $-\infty$ to the
determined value of $\varepsilon$, in the $\rho(\varepsilon) > 0$ case. Similar remarks apply if $\rho(\varepsilon) < 0$. (Note that, if $\rho(\varepsilon) = 0$,
the variables $\ln x$ and $\ln y$ are uncorrelated, and the conditional probability $\Pr\{y > B \mid x > A\}$
is the same as the unconditional probability $\Pr\{y > B\}$.)

All of the latter calculations can be done very fast. Of course, we have already used some
Monte Carlo calculations to get this far, unless we are in the simplified context of constant
volatility functions.

5.2 "What-if": The full Monte Carlo

It is easy to say how we would compute an answer to the same "what-if" question, using our
full joint probability distribution. We simply write

$$\Pr\{y > B \mid x > A\} = E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A))\big)$$

and interpret $\ln x$ in $-d_2(x)$, and $\ln y - \varepsilon$ in $-d_2(y, \varepsilon)$, in terms of their expansions in $\ln x_0$,
$\ln x_1, \ldots, \ln x_n$. To compute, say, the inner expectation by a Monte Carlo calculation, we would
generate a large number of random samples of multivariate standard normal vectors $z$ with
covariance matrix $C$. We then take the average, over the samples $z$ which happen to satisfy
$-d_2(x) > -d_2(A)$, of the function which is 1 when $-d_2(y, \varepsilon) > -d_2(B, \varepsilon)$ and 0 otherwise. We have not
experimented to see whether this method yields better answers than the regression procedure
above. Nevertheless, it illustrates how we could approach more sophisticated "what-if" questions
that could not be easily treated by regressions. For example, suppose we believe that factor $w$
will remain in a range $C < w < D$ and ask the same question about $y$, subject to the same
condition on $x$. This is hard to formulate in terms of regression, and is simply not possible in
terms of single-factor regression. However, it is easy to answer with the full distribution:

$$\Pr\{y > B \mid x > A,\ C < w < D\} = E\big(\Pr(-d_2(y, \varepsilon) > -d_2(B, \varepsilon) \mid -d_2(x) > -d_2(A),\ -d_2(C) < -d_2(w) < -d_2(D))\big).$$

Finally, we may not want to work in the log domain, which, if
$x = h_1 x_1 + h_2 x_2 + \cdots + h_n x_n$, would force us into an approximation, as noted. But, working with
the full distribution, we can phrase a condition $x > A$ as $h_1 x_1(y_1) + \cdots + h_n x_n(y_n) > A$, in the
language of the first section where the vector of $y$'s plays the role of our vector $z$ here. Monte
Carlo calculations can now proceed as before, using log domain expressions or not for the other
conditions.
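
The conditional averaging described in this subsection amounts to keeping only the samples that satisfy the condition and counting how often the event occurs among them. The Python sketch below shows this in warped (standard normal) coordinates for a two-variable case; the bivariate sampler, the thresholds, and all names are assumptions for illustration. Extra conditions, such as a range for a third factor, would simply be added to the `condition` predicate.

```python
import math
import random


def bivariate_draw(rho):
    """Return a sampler for a pair of standard normals with correlation rho."""
    def draw(rng):
        z1 = rng.gauss(0.0, 1.0)
        e = rng.gauss(0.0, 1.0)
        return (z1, rho * z1 + math.sqrt(1.0 - rho * rho) * e)
    return draw


def conditional_probability(draw, event, condition, n_samples=40000, seed=0):
    """Fraction of samples satisfying `event` among those satisfying `condition`."""
    rng = random.Random(seed)
    kept = 0
    hits = 0
    for _ in range(n_samples):
        z = draw(rng)
        if condition(z):
            kept += 1
            if event(z):
                hits += 1
    return hits / kept if kept else float("nan")


# Hypothetical example: z[0] plays the role of -d_2(x), z[1] of -d_2(y, eps).
p_cond = conditional_probability(
    bivariate_draw(0.5),
    event=lambda z: z[1] > 0.0,       # stands for y > B
    condition=lambda z: z[0] > 0.0,   # stands for x > A
)
```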

6 "You gotta believe" questions 

In the previous section we were focusing on an investor thinking about the value of his or her
portfolio $y$ in response to the change in a factor $x$. Conversely, an investor might want to know
what the investment world looks like if a given stock or index $y$ goes to a certain level $B$ at time
$T$. What is the expected value $A$ at time $T$ of another portfolio $x$, or simply of one of the factors
$x_i$? Our main plan is, upon input by the user that $y$ is going to level $B$, to list several assets $x_i$
or factors/indices $x$ most highly correlated with $y$ and their expected values with $y$ at $B$.

It would also be possible to display a confidence interval for each selected asset or factor,
and have other information about its new projected probability distribution readily available.
We could also offer comparisons with the old projected probability distribution of $x$, where no
assumption on $y$ is made. Finally, in some cases, where it was possible to explain much of
the variance of $y$ with just a few $x_i$ (appearing in the regression of $y$), we could list percentage
increases/decreases of a portfolio of these $x_i$ required to make $B$ the expected value of $y$, based
solely on its dependence on this portfolio. (For example, the coefficients in the portfolio could
come from the regression of $y$ with respect to all the $x_i$, or some new regression might be done,
perhaps allowing user-defined constraints.) It should be mentioned that medians or modes
are alternatives to expected values (means) here and above; in any case users will need to be
educated about the fact that the median and mode differ systematically from the mean in near
lognormal distributions.

The main problem might be viewed as understanding the probability distribution of x, given 
that y>B at a given time T, with x and y as in the previous section. This can be approached 
by the methods of the previous sections, by reversing the roles of the variables. 

There is, however, a simpler question that can be treated in an especially quick way. Consider
the problem of determining the mean of $x$ conditioned on the equality $y = B$ at time $T$. The
idea is to use simple regression methods, but interpret answers as measured in terms of our
variable volatilities. In our previous notation, we have a regression

$$-d_2(x) = \rho \cdot (-d_2(y, \varepsilon)) + \nu,$$

where $\rho$ (which we called $\rho(\varepsilon)$ in the previous section) is the historically determined correlation
between $\ln x$ and the random variable $\ln y - \varepsilon$. Note that the roles of dependent and independent
variable are reversed. There is also a residual $\nu$, which has mean 0 here, and plays no role (gets
averaged away). Thus, the desired conditional expected value $A$ of $x$ is obtained from

$$-d_2(A) = \rho \cdot (-d_2(B, \varepsilon)).$$

Recall that $\hat\sigma_1(y, \varepsilon)$ is an estimate, obtained by Monte Carlo methods, of the implied volatility
$\sigma_1$ associated to the random variable $\ln y - \varepsilon$. For faster but less accurate calculations it can be
estimated historically as $\sqrt{\sum_{i,j} \gamma_i \sigma_i \rho_{ij} \gamma_j \sigma_j}$ with each of the $\sigma$'s, $\gamma$'s, and $\rho$'s here given historically.
(See the previous section.) Similarly, for fast calculations, $-d_2(x)$ could use
historical volatility, though we expect it to be given more accurately, or rather, more accurately
according to the market view, as $-d_2(x) = -d_2(s_x^*, x, \hat\sigma_1(x))$, using the implied volatility function
estimate $\hat\sigma_1(x)$. If $x = x_i$ is a single asset or index in our model, then $\hat\sigma_1(x) = \sigma_1(x_i)$ does
not require a Monte Carlo estimate, but is presumably already available.
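
The regression step can be sketched numerically under simplifying assumptions that are not part of the text: constant (historical) volatilities, the residuals set to their mean of zero, and the usual Black-Scholes form of $d_2$ with a forward estimate $s^*$. With variable volatility curves the last step would require a root-finder instead of a closed-form inversion. All names below are hypothetical.

```python
import math


def neg_d2(forward, level, vol, t):
    """Standardized log level, -d_2: (ln(level/forward) + vol^2 t / 2) / (vol sqrt(t))."""
    return (math.log(level / forward) + 0.5 * vol * vol * t) / (vol * math.sqrt(t))


def level_from_neg_d2(forward, z, vol, t):
    """Inverse of neg_d2 in `level` for constant vol."""
    return forward * math.exp(z * vol * math.sqrt(t) - 0.5 * vol * vol * t)


def conditional_level(rho, fwd_x, vol_x, fwd_y, vol_y, level_b, t):
    """Level A of x implied by y = B via -d_2(A) = rho * (-d_2(B))."""
    z_y = neg_d2(fwd_y, level_b, vol_y, t)
    return level_from_neg_d2(fwd_x, rho * z_y, vol_x, t)


# Hypothetical inputs: both assets at 100 with 20% volatility, y assumed to reach 120.
implied_a = conditional_level(1.0, 100.0, 0.2, 100.0, 0.2, 120.0, 1.0)
```

With perfect correlation and identical parameters the implied level of $x$ equals the assumed level of $y$, and with zero correlation it does not depend on $B$ at all.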

To summarize, the conditional expected values required to answer "You gotta believe" questions
are easily obtained by regression methods. The accuracy of such answers is enhanced, or at
least shaped more to reflect market input, when all logvariables are measured in "standard
deviations," interpreted as our variable volatilities.

7 Portfolios containing option securities 

We conclude this document by briefly pointing out that our methods, when using full Monte
Carlo calculations, easily apply to portfolios containing option securities. The well-known idea
is to think of an option as a kind of nonlinear portfolio: a quadratic one, to be more
precise. Thus, an option on a single underlying security with underlying price $x_1$ has a price
approximately $x = c + \Delta(x_1 - s_1) + (1/2)\Gamma(x_1 - s_1)^2$ for $x_1$ near $s_1$, where the option was evaluated
to a known value $c$. Here $\Delta$ and $\Gamma$ are well-known parameters in the options markets, giving
the first and second derivatives of the option price at $s_1$ with respect to the underlying security
price $x_1$. Perhaps the most characteristic feature of options is that they have nonzero $\Gamma$: their
proportion of increase or decrease with respect to the underlying security price changes as the
security price changes. Explicit formulas in terms of other standard parameters are available,
say, in the Black-Scholes theory for both $\Delta$ and $\Gamma$ (see the Hull book cited above). Such formulas
could be obtained by differentiation directly in other theories or when using empirically-fitted
curves. In any case, once we have such an explicit approximation to $x$, its probability distribution
is easily given by the Monte Carlo methods of Subsection 3.1 above. The same method applies
as well to portfolios containing several options and other securities.
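
As a rough illustration, the quadratic approximation can be applied to sampled underlying prices to obtain samples of the option value. In the Python sketch below the lognormal sampler for $x_1$ is a stand-in for a market-derived distribution, and the parameter values are made up; the approximation itself is only meant to be used for $x_1$ near $s_1$.

```python
import math
import random


def delta_gamma_price(x1, s1, c, delta, gamma):
    """Quadratic approximation c + delta*(x1 - s1) + 0.5*gamma*(x1 - s1)^2."""
    dx = x1 - s1
    return c + delta * dx + 0.5 * gamma * dx * dx


def simulate_option_values(s1, c, delta, gamma, vol, t, n_samples=10000, seed=0):
    """Sample x1 from an assumed lognormal law and map it through the approximation."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        y = rng.gauss(0.0, 1.0)
        x1 = s1 * math.exp(vol * math.sqrt(t) * y - 0.5 * vol * vol * t)
        values.append(delta_gamma_price(x1, s1, c, delta, gamma))
    return values


# Hypothetical option: value 5 at s1 = 100, delta 0.5, gamma 0.02, one-month horizon.
option_values = simulate_option_values(100.0, 5.0, 0.5, 0.02, 0.2, 1.0 / 12.0, n_samples=1000)
```

Sorting the resulting values gives an empirical cdf of the option price; a portfolio of several options and securities would sum such terms per sample.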
CLAIMS

1. A method comprising:

receiving data representing current prices of options on a given asset,

deriving from said data an estimate of a corresponding implied probability distribution of the price of said asset at a future time, and

making information about said probability distribution available within a time frame that is useful to investors.

2. The method of claim 1 in which the data represent a finite number of prices of options at spaced-apart strike prices of the asset, and also including

calculating a set of first differences of said finite number of prices to form an estimate of the cumulative probability distribution of the price of said asset at a future time.

3. The method of claim 2 also including
calculating a set of second differences of the finite number of strike prices from the set of first differences to form an estimate of the probability distribution function of the price of said asset at a future time.
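
The differencing steps of claims 2 and 3 can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes call-option prices, a zero interest rate (so no discount factor appears), and a toy set of prices constructed from a hypothetical three-point distribution. All function names are illustrative.

```python
def first_differences(strikes, calls):
    """Slope of the call-price curve between each pair of adjacent strikes."""
    return [
        (calls[i + 1] - calls[i]) / (strikes[i + 1] - strikes[i])
        for i in range(len(strikes) - 1)
    ]


def implied_cdf(strikes, calls):
    """Cumulative probability estimated at the midpoints between strikes.

    With a zero interest rate, the slope of the call price with respect to
    strike is minus the probability of finishing above the strike, so the
    cumulative probability is one plus the slope.
    """
    mids = [(strikes[i] + strikes[i + 1]) / 2.0 for i in range(len(strikes) - 1)]
    return mids, [1.0 + slope for slope in first_differences(strikes, calls)]


def implied_probabilities(strikes, calls):
    """Probability mass near each interior strike (differences of the slopes)."""
    slopes = first_differences(strikes, calls)
    masses = [slopes[i + 1] - slopes[i] for i in range(len(slopes) - 1)]
    return strikes[1:-1], masses


# Toy data: call prices implied by a future price of 90, 100 or 110 with
# probabilities 0.25, 0.5 and 0.25 (hypothetical numbers).
strikes = [80.0, 90.0, 100.0, 110.0, 120.0]
calls = [20.0, 10.0, 2.5, 0.0, 0.0]

mids, cdf = implied_cdf(strikes, calls)
centers, masses = implied_probabilities(strikes, calls)
```

For this toy input the first differences recover cumulative probabilities of 0, 0.25, 0.75 and 1 at strikes 85, 95, 105 and 115, and the second differences recover the masses 0.25, 0.5 and 0.25 at strikes 90, 100 and 110.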

4. A method comprising:
receiving data representing current prices of options on a given asset,
deriving from said data an estimate of a corresponding implied probability distribution of the price of said asset at a future time, and
providing a real-time data feed containing information based on said probability distribution.

5. A method comprising:
providing a graphical user interface for viewing pages containing financial information related to an asset; and
when a user indicates an asset of interest, displaying probability information related to the price of the asset at a future time.

6. A method comprising:
enabling a user to identify an asset of interest, the asset being one for which data representing current prices of options on the asset are available,
deriving from said data an estimate of a corresponding implied probability distribution of the price of said asset at a future time, and
providing a display of a probability distribution of prices of the asset at future times.

7. A method comprising:
enabling a user to indicate a future time and to identify an asset of interest, the asset being one for which data representing current prices of options on the asset are available, and
displaying to the user a distribution of the probability that the asset will reach prices within a range of prices at the future time.
8. A method comprising:
receiving data representing current prices of options on a given asset, the options being associated with spaced-apart strike prices of the asset at a future time,
the data including shifted current prices of options resulting from a shifted underlying price of the asset, the amount by which the asset price has shifted being different from the amount by which the strike prices are spaced apart, and
deriving from said data an estimate of a quantized implied probability distribution of the price of said asset at a future time, the elements of the quantized probability distribution being more finely spaced than for a probability distribution derived without the shifted current price data.

9. A method comprising
receiving data representing current prices of options on a given asset, the options being associated with spaced-apart strike prices of the asset at a future time,
deriving from said data an estimate of an implied probability distribution of the price of said asset at a future time, the mathematical derivation including a smoothing operation, and
making information about said probability distribution available within a time frame that is useful to investors.

10. The method of claim 9 in which the smoothing operation is performed in a volatility domain.

11. The method of claim 9 in which the smoothing operation is performed in the domain of the option prices or in the domain of the probability distribution information.

12. A method comprising:
receiving data representing current prices of options on a given asset, the options having strike prices at future dates,
deriving a volatility for each of the future dates in accordance with a predetermined option pricing formula that links option prices with strike prices of the asset;
generating a smoothed and extrapolated volatility function; and
using the volatility information to generate information within a time-frame that is useful for investors.

13. The method of claim 12 in which the volatility function is extrapolated to a wider range of dates than the future dates.

14. The method of claim 12 in which the volatility function is extrapolated to strike prices other than the strike prices of the options.

15. The method of claim 9 also including
generating a smoothed volatility function using only data that are reliable under a predetermined measure of reliability.

16. The method of claim 9, further comprising:
generating an implied volatility function formula having a quadratic form with two variables representing a strike price and an expiration date;
wherein coefficients of the implied volatility function formula are determined by applying regression analysis to approximately fit the implied volatility function formula to each of the implied volatilities.
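
One way to picture the regression of claim 16 is an ordinary least-squares fit of a six-term quadratic in two variables. The sketch below is an assumption-laden illustration, not the claimed formula: the particular basis (1, k, t, k^2, k*t, t^2), the use of strike divided by current price for k and years to expiration for t, the plain normal-equations solver, and the sample data are all hypothetical.

```python
def features(k, t):
    """Quadratic basis in strike variable k and time-to-expiration t."""
    return [1.0, k, t, k * k, k * t, t * t]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit_quadratic_surface(points):
    """Least-squares coefficients for points given as (k, t, implied_vol)."""
    n = 6
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for k, t, vol in points:
        f = features(k, t)
        for i in range(n):
            atb[i] += f[i] * vol
            for j in range(n):
                ata[i][j] += f[i] * f[j]
    return solve_linear_system(ata, atb)


def surface(coeffs, k, t):
    """Evaluate the fitted volatility function at (k, t)."""
    return sum(c * f for c, f in zip(coeffs, features(k, t)))


# Hypothetical implied volatilities generated from a known quadratic surface.
true_coeffs = [0.5, -0.6, 0.05, 0.3, -0.02, 0.01]
points = [
    (k, t, surface(true_coeffs, k, t))
    for k in (0.8, 0.9, 1.0, 1.1, 1.2)
    for t in (0.25, 0.5, 0.75, 1.0)
]
fitted = fit_quadratic_surface(points)
```

Once fitted, `surface(fitted, k, t)` can be evaluated at strikes and dates for which no option is quoted, which is how a smoothed function of this kind would supply values between the observed points.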

17. A method comprising:
receiving data representing current prices of options on assets belonging to a portfolio,
deriving from said data an estimate of an implied multivariate distribution of the price of a quantity at a future time that depends on the assets belonging to the portfolio, and
making information about said probability distribution available within a time frame that is useful to investors.

18. A method comprising:
receiving data representing values of a set of factors that influence a composite value,
deriving from said data an estimate of an implied multivariate distribution of the price of a quantity at a future time that depends on assets belonging to a portfolio, and
making information about said probability distribution available within a time frame that is useful to investors.

19. The method of claim 18 in which the mathematical derivation includes generating a multivariate probability distribution function based on correlations among the factors.

20. A graphical user interface comprising:
a user interface element adapted to enable a user to indicate a future time;
a user interface element adapted to show a current price of an asset; and
a user interface element adapted to show the probability distribution of the price of the asset at the future time.

21. A method comprising:
continually generating current data that contains probability distributions of prices of assets at future times,
continually feeding the current data to a recipient electronically, and
the recipient using the fed data for services provided to users.

22. A method comprising:
receiving data representing current prices of options on assets belonging to a portfolio,
receiving data representing current prices of market transactions associated with a second portfolio of assets, and
providing information electronically on the probability that the second portfolio of assets will reach a first value given the condition that the first portfolio of assets reaches a specified price at a future time.

23. A method comprising:
receiving data representative of actual market transactions associated with a first portfolio of assets;
receiving data representative of actual market transactions associated with a second portfolio of assets;
providing information on the expectation value of the price of the first portfolio of assets given the condition that the second portfolio of assets reaches a first specified price at a specified future time through a network.

24. A method comprising
evaluating an event defined by a first multivariate expression that represents a combination of macroeconomic variables at a time T, and
estimating the probability that a second multivariate expression that represents a combination of values of assets of a portfolio will have a value greater than a constant B at time T if the value of the first multivariate expression is greater than a constant A.

25. The method of claim 24 in which the probability is estimated using Monte Carlo techniques.
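
The conditional estimate of claims 24 and 25 can be pictured with a simple counting scheme: simulate the two expressions jointly, keep only the trials in which the first exceeds A, and report the fraction of those in which the second exceeds B. The sketch below is a minimal illustration under assumptions that do not come from the claims: both expressions are taken to be standardized and jointly normal with a hypothetical correlation `rho`, and the function name is illustrative.

```python
import math
import random


def conditional_probability(rho, a, b, n=100000, seed=1):
    """Monte Carlo estimate of P(Y > b | X > a).

    Assumption: X (the macroeconomic expression) and Y (the portfolio
    expression) are standard normal with correlation rho.
    """
    rng = random.Random(seed)
    conditioned = 0
    hits = 0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = z1
        y = rho * z1 + math.sqrt(1.0 - rho * rho) * z2
        if x > a:
            conditioned += 1
            if y > b:
                hits += 1
    if conditioned == 0:
        return float("nan")  # the conditioning event never occurred
    return hits / conditioned


p_independent = conditional_probability(rho=0.0, a=0.0, b=0.0)
p_correlated = conditional_probability(rho=0.8, a=0.0, b=0.0)
```

With zero correlation the estimate should be close to the unconditional value of one half, and with positive correlation it should rise, which gives a quick sanity check on any simulation of this kind.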

26. A method comprising
defining a regression expression that relates the value of one variable representing a combination of macroeconomic variables at time T to a second variable at time T that represents a combination of assets of a portfolio, and
estimating the probability that the second variable will have a value greater than a constant B at time T if the value of the first variable is greater than a constant A at time T, based on the ratio of the probability of X being greater than A under the regression expression and the probability of X being greater than A.

27. A method comprising
defining a current value of an option as a quadratic expression that depends on the difference between the current price of the option and the current price of the underlying security, and
using Monte Carlo techniques to estimate a probability distribution of the value at a future time T of a portfolio that includes the option.

28. A method comprising
displaying to a user a circular visualization element having sectors arranged around a center of the element, the sectors respectively corresponding to different groups of assets,
in each of the sectors, displaying an array of visual elements representative of respective assets belonging to the group to which the sector corresponds, the visual elements being arrayed with respect to distance from the center in accordance with magnitudes of performance of the assets during a recent period.

29. The method of claim 28 in which the visual elements comprise displayed dots, one for each of the assets.

30. The method of claim 28 in which the visual elements exhibit visible characteristics that correspond to categories of the assets within the group.

31. The method of claim 30 in which the categories of the assets within the group correspond to different capitalizations.

32. The method of claim 29 in which dots are arranged along a radius of the sector to which they belong.

33. The method of claim 32 in which dots that would otherwise lie on the radius at a given distance from the center are displayed at different angular positions near to the radius.

34. The method of claim 28 in which the sectors have angular extents that represent the fractions of the total number of asset items represented by the respective sectors.

35. The method of claim 28 in which the circular visualization element is subdivided into rings having respectively different distances from the center.

36. The method of claim 35 in which the rings are displayed in different colors.

37. The method of claim 28 in which the magnitudes of performance of the assets are measured in percentage price change.

38. The method of claim 28 in which the recent period comprises a trading day on an asset market.

39. The method of claim 28 in which the assets comprise securities issued by corporations.

40. A displayed visualization element that
is circular,
has sectors arranged around a center of the element, the sectors respectively corresponding to different groups of securities issued by corporations,
in each sector, has an array of dots representing respective securities belonging to the group to which the sector corresponds, each of the dots lying on or near a radius of the sector and each having a distance from the center along the radius that corresponds to the percentage change in the price of the represented security [illegible],
has differently colored rings at respectively different distances from the center.
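
The geometry described in claims 28 through 40 can be sketched as a small layout routine that assigns each group a sector whose angular extent is proportional to its number of assets and places each asset's dot on the sector's central radius. This is only one possible reading: the linear mapping from percentage change to distance, the `base_radius` and `scale` parameters, and the group and ticker names are hypothetical choices for illustration.

```python
import math


def layout(groups, base_radius=1.0, scale=0.1):
    """Return (ticker, group, x, y) for each asset.

    groups maps a group name to a list of (ticker, percent_change) pairs.
    Assumption: distance from the center is base_radius + scale * percent_change,
    floored at zero, so an unchanged asset sits on the base ring.
    """
    total = sum(len(assets) for assets in groups.values())
    dots = []
    start = 0.0
    for name, assets in groups.items():
        extent = 2.0 * math.pi * len(assets) / total  # sector angular extent
        mid_angle = start + extent / 2.0              # central radius of the sector
        for ticker, pct in assets:
            r = max(0.0, base_radius + scale * pct)
            dots.append((ticker, name, r * math.cos(mid_angle), r * math.sin(mid_angle)))
        start += extent
    return dots


# Hypothetical groups and percentage price changes over one trading day.
groups = {
    "tech": [("AAA", 2.0), ("BBB", -1.0)],
    "energy": [("CCC", 0.0), ("DDD", 5.0)],
}
dots = layout(groups)
```

Dots that coincide at the same distance could be nudged to slightly different angles near the radius, and concentric rings could be drawn at fixed values of the distance. Both are left out of the sketch to keep it short.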

41. A method comprising
displaying to a user a visualization element that indicates the odds of a performance measure of an asset being within specified ranges of identified values of the performance measure at a succession of times in the future.

42. The method of claim 41 in which the performance measure comprises a price of the asset.

43. The method of claim 41 in which the performance measure comprises a return percentage.

44. The method of claim 41 in which the performance measure comprises a tax-adjusted return percentage.

45. The method of claim 41 in which the visualization element includes stripes superimposed on a graph of the performance measure over time, each of the stripes representing one of the specified ranges.

46. The method of claim 45 in which each of the stripes begins at a current time and becomes broader as it extends to future times.
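
The broadening stripes of claims 45 and 46 can be illustrated by computing the price levels that bound a set of probability ranges at successive future times. The sketch below assumes, purely for illustration, a zero-drift lognormal price with constant volatility. The claims do not prescribe this model, and the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def band_edges(s0, sigma, t, probs):
    """Price levels below which the asset falls with the given probabilities.

    Assumption: the price at time t (in years) is lognormal with zero drift
    and annualized volatility sigma, starting from s0 today.
    """
    standard_normal = NormalDist()
    return [
        s0 * math.exp(-0.5 * sigma ** 2 * t
                      + sigma * math.sqrt(t) * standard_normal.inv_cdf(p))
        for p in probs
    ]


probs = [0.10, 0.25, 0.75, 0.90]   # stripe boundaries, as cumulative probabilities
times = [0.0, 0.25, 0.5, 1.0]      # years from the current time
edges_by_time = [band_edges(100.0, 0.2, t, probs) for t in times]
```

At the current time every boundary coincides with the current price, and the gap between the outer boundaries widens as the horizon lengthens, which produces stripes that start at a point and fan out.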

WO 02/17120 PCT/US01/25753

47. The method of claim 41 also including
displaying a graphical device that shows actual historical values of the performance measure.

48. The method of claim 47 in which the graphical device that shows actual historical values is a line graph one end of which joins the visualization element at a point which represents a current date.

49. The method of claim 41 in which the visualization element includes two portions, one of the portions representing the odds prior to a specified date based on one assumption, the other of the portions representing the odds after the specified date based on another assumption.

50. The method of claim 49 in which the specified date is a date on which tax effects change from the one assumption to the other assumption.

51. A method comprising
displaying to a user a visualization element having graphical indicators of the relative performance of a selected asset compared with the performance of groups of assets in each of a succession of time periods, each of the groups comprising assets representing a common style.

52. The method of claim 51 in which the style comprises a class of investment objectives.

53. The method of claim 51 in which the relative performance is determined using an asset class factor model.